
xmpMetadata and (X)HTML character references

(11 posts)

  1. guilhem

    Junior
    Joined: Jan '12
    Posts: 8

    Hello,

    I'm wondering: since XMP is XML-based, shouldn't xmpMetadata decode HTML character references? exiftool -tagsfromfile img.jpg img.xmp produces an XMP file in which & (&amp;), ' (&#39;), " (&quot;), > (&gt;), and < (&lt;) are escaped. On the other hand, exiv2 ex -e xX img.jpg leaves quotes alone but escapes the linefeed (&#xa;), among others.

    Or perhaps there is another way to work around HTML character references?
    Thanks anyway!
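
    To make the escaping described above concrete, here is a minimal Python sketch of the kind of string an XMP sidecar might end up containing. It is only an illustration: the function name escape_for_xmp is hypothetical, the choice of &#39; and &#xa; for the apostrophe and linefeed is an assumption, and this is not the actual code of exiftool, exiv2, or the xmpMetadata plugin.

```python
from xml.sax.saxutils import escape


def escape_for_xmp(text):
    """Escape text roughly the way an XML writer might (illustrative only).

    escape() itself handles '&', '<' and '>'. The extra mapping adds the
    quote characters and the linefeed, which is an assumption about what
    a given tool chooses to escape.
    """
    return escape(text, {'"': "&quot;", "'": "&#39;", "\n": "&#xa;"})


if __name__ == "__main__":
    print(escape_for_xmp('Tom & "Jerry" <3\n'))
```

    A reader of the sidecar file has to reverse exactly this step once, otherwise the references show up literally in captions and titles.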

    Posted 1 year ago #
  2. Zenphoto development team
    sbillard

    Chief Developer
    Joined: May '07
    Posts: 9,818

    The plugin probably should be decoding HTML entities. We will add that to the list.

    Don't forget to read the Forum rules and usage resources
    Posted 1 year ago #
  3. guilhem

    Junior
    Joined: Jan '12
    Posts: 8

    All right, thanks ;) If you want me to open a ticket, just let me know.

    Posted 1 year ago #
  4. Zenphoto development team
    sbillard

    Chief Developer
    Joined: May '07
    Posts: 9,818

    Normally, yes, but I have already made the change and it will be in tonight's nightly build. I would appreciate some testing, though.

    Posted 1 year ago #
  5. guilhem

    Junior
    Joined: Jan '12
    Posts: 8

    Wow, that was fast!
    I checked it against https://en.wikipedia.org/wiki/Character_entity_reference, and as far as I can see the named and numeric (decimal and hexadecimal) references are rendered properly, with one exception: the ampersand (&amp;, &#38; and &#x26;) "eats" one character too many in some (!) cases. Try, e.g., to render &'&&apos;&x&&sect;&e. On the other hand, &a&a looks fine.

    Posted 1 year ago #
  6. Zenphoto development team
    sbillard

    Chief Developer
    Joined: May '07
    Posts: 9,818

    It is hard to read/write HTML entities on a website. But it seems to me that what you are describing is that the translation fails when you have a naked ampersand preceding an entity. That is, of course, not legal--an ampersand is supposed to be represented by &amp;.

    Posted 1 year ago #
  7. guilhem

    Junior
    Joined: Jan '12
    Posts: 8

    Oops, sorry for the mess. No, I mean that if you write an entity that represents the ampersand, then in some cases the character that immediately follows it is ignored. Try, e.g., to render https://pastebin.com/raw.php?i=y77sUUcB: the first line is messed up, while the second is fine.

    Posted 1 year ago #
  8. Zenphoto development team
    sbillard

    Chief Developer
    Joined: May '07
    Posts: 9,818

    It looks like it is rendering correctly to me. However, remember that the output may cause you issues: &§ is not valid HTML.

    Posted 1 year ago #
  9. guilhem

    Junior
    Joined: Jan '12
    Posts: 8

    Ah? I know that it is not valid HTML, but with the first line of my paste above (it's raw, there is no translation), I would expect &'&'&x&§&e, whereas I get the output shown in http://i.imgur.com/hYVsx.png.
    Don't you get the same result?

    Posted 1 year ago #
  10. Zenphoto development team
    sbillard

    Chief Developer
    Joined: May '07
    Posts: 9,818

    No, I do not; I get what you expect. I am guessing that what you see is a result of the browser trying to interpret the &§.

    Anyway, I did my tests by saving the result to a disk file to keep the browser out of the picture.

    But I did notice that &apos; does not get converted. So maybe we really need a full XML character table and not just PHP's html_entity_decode(). I'll work on that.
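
    For readers following along, the idea of an XML-aware decoder can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not the plugin's PHP implementation: the name decode_xml_references is hypothetical, only the five entities predefined by XML plus numeric references are handled, and unknown names such as &sect; are deliberately left untouched. The sample input is made up for illustration and is not the content of the paste linked earlier.

```python
import re

# The five named entities predefined by XML 1.0, including &apos;.
XML_ENTITIES = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}

# One pattern covering hexadecimal, decimal, and named references.
_REFERENCE = re.compile(r"&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def decode_xml_references(text):
    """Decode XML character references in a single left-to-right pass.

    Because re.sub never rescans its own replacements, "&amp;apos;"
    becomes "&apos;" rather than "'", and a decoded ampersand cannot
    swallow the characters that follow it.
    """
    def replace(match):
        body = match.group(1)
        try:
            if body[:2] in ("#x", "#X"):
                return chr(int(body[2:], 16))
            if body.startswith("#"):
                return chr(int(body[1:]))
        except (ValueError, OverflowError):
            # Out-of-range code point: keep the reference as written.
            return match.group(0)
        # Names outside the XML set are left as they are.
        return XML_ENTITIES.get(body, match.group(0))

    return _REFERENCE.sub(replace, text)


if __name__ == "__main__":
    sample = "&amp;&#39;&amp;&apos;&amp;x&amp;&#xa7;&amp;e"
    print(decode_xml_references(sample))  # prints &'&'&x&§&e
```

    Writing the result to a file, as described above, rather than viewing it in a browser avoids a second round of entity interpretation when checking the output.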

    Posted 1 year ago #
  11. guilhem

    Junior
    Joined: Jan '12
    Posts: 8

    All right, thanks for your work.

    Posted 1 year ago #
