Jan

3

Software Design Paralysis

Friends, I have come to an impasse.

I have an awesome concept for embedding images into these blog posts. The idea is this:

  • Use an Attachment model with a generic foreign key to allow images to be attached to blog posts
  • Use a markdown extension to allow attached files to be referenced inside the blog post's body

Here's the syntax I wanted to support in markdown:



This isn't that impossible a task, and in fact, I have done it, but there are some wrinkles. The main issue is that I really want to limit my markdown extension's file lookup to only those files actually attached to the blog post. The second issue is that I would like to be able to specify a template to use for rendering each attachment, and I'd like to be able to specify different templates in different contexts. Both problems revolve around how I should do the actual conversion from markdown syntax to HTML.

I implemented my solution as a markdown extension because I wanted to continue to use the markdown template filter. I'm already using it with the pygments extension I discussed earlier:

{{ blogpost.body|markdown:"pygments" }}

I was imagining that I could implement my attachments extension and then just change the code like so:

{{ blogpost.body|markdown:"pygments,attachments" }}

That's nice and tidy, right? But how will the attachments extension know which blogpost it's operating on? Attachments have generic foreign keys. How will it even know it's operating on a blogpost? And what about specifying a template for rendering those attachments? Blast! We will have to complicate the solution.

It turns out that markdown.py's extension system supports the passing of config parameters. If I pass three variables: the content type id, the object id, and the template, then I'll be able to identify which blog post to use, and which template to use. This is how that would look in markdown's extension config param syntax:

{{ blogpost.body|markdown:"pygments,attachments(ct=5,id=3,template=blog/attach.html)" }}

Problem the first: the content type and id values are hard-coded in there. I can't drop variables into a string that's being passed to a template filter. Problem the second: Django splits this big string argument into a list of extensions to pass to markdown by splitting on ',', so my config params are interpreted as extension boundaries. There are other mysteries nestled within markdown.py's extension param handling system, but I'll talk about those in another post if I ever come out with a final draft for this dang thing.
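To make the second problem concrete, here's a minimal sketch in plain Python. It assumes the filter does nothing more than split its argument on commas, as described above; `split_extensions` is a made-up name for illustration, not Django's actual code.

```python
def split_extensions(arg):
    """Naive comma split, standing in for what the filter does to its argument."""
    return [name.strip() for name in arg.split(",") if name.strip()]


# One config param survives the split intact...
one_param = split_extensions("pygments,attachments(template=blog/attach.html)")

# ...but several params are chopped into bogus "extension names".
three_params = split_extensions(
    "pygments,attachments(ct=5,id=3,template=blog/attach.html)"
)

print(one_param)
print(three_params)
```

With a single param the list still has two entries, one per extension. With three params the attachments entry is torn into three fragments, none of which is a loadable extension name.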

So, I can only specify one parameter per extension if I'm passing it in through the template filter. I can't pass in the blogpost identifying data there anyway, so this works out: I'd just pass in the name of the template I want it to use for rendering the attachments. But that still leaves the problem of identifying the blog post to the attachments extension. Without that information, the extension will be looking up the specified filename in the set of all attachments in the system. The larger the site, the greater the risk of namespace collisions.
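Here's a rough sketch of why the missing object information matters. It uses plain Python stand-ins rather than the real Attachment model; the `Attachment` tuple, the sample data, and both lookup functions are hypothetical and only illustrate the collision.

```python
from collections import namedtuple

# Hypothetical stand-in for the Attachment model: a generic foreign key boils
# down to a (content type id, object id) pair, plus the file itself.
Attachment = namedtuple("Attachment", "content_type_id object_id filename")

ATTACHMENTS = [
    Attachment(5, 3, "diagram.png"),  # attached to this blog post
    Attachment(5, 8, "diagram.png"),  # same filename on a different post
]


def find_global(filename):
    """Lookup with no object information: every match in the whole system."""
    return [a for a in ATTACHMENTS if a.filename == filename]


def find_scoped(filename, content_type_id, object_id):
    """Lookup restricted to the files attached to one object."""
    return [
        a for a in ATTACHMENTS
        if a.filename == filename
        and (a.content_type_id, a.object_id) == (content_type_id, object_id)
    ]
```

`find_global("diagram.png")` returns both rows, so the extension would have to guess which file was meant. `find_scoped("diagram.png", 5, 3)` returns only the file attached to this post.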

I think there are three solutions to this problem, and I don't like any of them all that much:

  1. Implement a method on BlogPost that converts its body text from markdown to html, applying the appropriate markdown extensions, taking one optional parameter: the template to use for the attachments. If I wanted to do this, I'd have to figure out some way of invoking the method with a template argument from within a template. Here's the least awful hack I can think of for that: [sourcecode:django] {{ blogpost|render_body:"blog/attach.html" }} [/sourcecode] That template filter could just invoke the method on the blogpost object, passing that variable. It works, but it's ugly, and it replaces a standard template filter (markdown) with a custom template filter that depends on a custom method.
  2. Write a template tag for rendering the body of a blogpost. That would look like this: [sourcecode:django] {% render_body blogpost "blog/attach.html" %} [/sourcecode] That's less of a hack than the first option, and I can keep all the markdown invocations inside the template tag, which spares the BlogPost model from having to deal with it. It's still a custom template tag replacing a standard template filter. This wouldn't be so bad except that inside the template tag I'm still invoking markdown by passing it a list of extension names. My attachments extension has to live in the special markdown extension format: a file called mdx_attachments.py that lives in the root dir of some place in the python path. If I peel back the layers of markdown even further I could pass in my extension object directly without this mdx file, but it feels silly to put all this effort into disabling the handy extension loading functionality that I'm getting for free from markdown.py.
  3. The last option is to try and stay compatible with markdown and the markdown template filter. I could pass the object identifying information via a template filter provided by the attachments app. It would look like this: [sourcecode:django] {{ blogpost.body|annotate:blogpost|markdown:"pygments,attachments(template=blog/attach.html)" }} [/sourcecode] This makes for verbose, less legible template code, but it keeps my interference in the markdown system to a minimum. The additional problem is how the annotation would work. Other than doing something crazy like subclassing `str` and adding the variables there, I think the only way would be to add a line to the top of the text that looked something like this: [sourcecode:markdown] [attach_object:5,3] [/sourcecode] The arguments here are the content type id and the object id. With this, the attachment extension could identify the object first, before it began handling the `[attach:(9,34)xxxx]` elements. I think I like this solution best. Is this crazy?
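The annotation idea in option 3 can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the function names are invented, the header must be the very first line, and the real `annotate` filter would receive the blogpost object and look up its content type id rather than take two integers.

```python
import re

# Hypothetical header format from option 3: "[attach_object:<ct id>,<object id>]"
HEADER_RE = re.compile(r"^\[attach_object:(\d+),(\d+)\]\n")


def annotate(text, content_type_id, object_id):
    """Prepend the identifying header line to the markdown source."""
    return "[attach_object:%d,%d]\n%s" % (content_type_id, object_id, text)


def read_annotation(text):
    """Return ((content_type_id, object_id), body), or (None, text) if absent."""
    match = HEADER_RE.match(text)
    if match is None:
        return None, text
    ids = (int(match.group(1)), int(match.group(2)))
    # Strip the header so it never shows up in the rendered HTML.
    return ids, text[match.end():]


annotated = annotate("Hello *world*", 5, 3)
ids, body = read_annotation(annotated)
```

The extension would call something like `read_annotation` first, then restrict its attachment lookups to the object those two ids identify. Text with no header passes through unchanged, so the plain markdown filter keeps working on posts without attachments.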

The solution I have now is #2, but I think I might rewrite it to #3. Before I introduce more churn into my code, I'm hoping I can get some feedback from other developers.

Why not turn autoescaping off and stick in an img tag?

I kind of don't understand. Not just your mission, here, but the whole fear of storing (X)HTML in the database. First of all, this is your blog, right? You can trust yourself to enter clean HTML, can't you? You're not going to try to hijack your own page with Javascript, I trust?

And even if it's not just content you're creating yourself, it seems to me much easier to restrict someone to a subset of HTML than to extend Markdown to accommodate particular use cases. As you're showing here.

Bottom line: I say it's way preferable to make HTML more Markdown-like than to make Markdown more HTML-like.

— Hank Sims (January 03, 2009 at 1:39 a.m.)

I too must be missing something. Why not store your original blog post (assumedly in markdown) and the html version both in the database? You edit the markdown version and serve up the html version, which is generated when you save.

— Jason Christa (January 03, 2009 at 9:32 a.m.)

Why do you need the generic foreign key association at all? If your attachments have unique names, and you refer to them by name in your custom Markdown syntax, what's the FK actually buy you? An extra bit of repetitive work for every attachment to every blog post?

— Carl M (January 04, 2009 at 4:31 p.m.)

While I agree with others here that you're probably over-engineering this for a personal blog app, this is a real problem that I have given limited thought to.

The fact is, the way Django filters work, the markdown filter cannot take custom config options. I've always figured that this could be handled at either the model or view level. But you provide an interesting edge case: passing in something defined in the template.

I'm assuming that you want the template author to be able to define the template used, therefore you want the setting for which template to use also defined in a template. Remember, if you hardcode the template name, the template author can still create/edit a template of the same name which will replace/override the default. In fact, this is generally how third party apps allow template overrides in my experience anyway. So, yeah, even with your edge case, it's still over-engineered. Just hardcode the template name and if it's really important, make it a setting in settings.py that falls back on a default if undefined.

As a side point: I notice you have some issues/concerns with the way Markdown's config options work. Could you please pass those on (email them either to me or the python-markdown list) so we can address them before we release 2.0 (happening soonish) and lock in the API. As one of the core devs, I understand how they work intimately, so I may not always see how others may view the API. Being that we've never received any comments about it before (that I'm aware of), I'm curious what your issues/concerns/difficulties were.

Waylan (January 04, 2009 at 4:57 p.m.)