What's invalid in my .pot file?

Asked by Stuart P. Bentley

Rosetta sent:
"We were unable to import the file because of errors in its format:

Line 19: Could not decode input from UTF-8

(snip)

For your convenience, you can get the file you uploaded at:
http://launchpadlibrarian.net/30753700/template.pot"

What's wrong about this file? From what I gathered from the gettext manual it should be fine.

Question information

Language: English
Status: Solved
For: Launchpad itself
Assignee: No assignee

Jeroen T. Vermeulen (jtv) said :
#1

Thanks for including the link to the file—makes it much easier to diagnose the problem without a lot of slow back-and-forth!

The error means that there's an encoding problem. Launchpad expects the file to be in UTF-8, but trying to read it as such fails.

Now, I don't see anything in the file that looks like it's not ASCII. But I do see that it specifies its character set as "charset=CHARSET". Try replacing that with "charset=UTF-8"... does that solve the problem? If so, this may be a small bug in how we report errors with unknown character sets. If not, we'll keep looking for weird characters.
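For anyone who wants to check both suspects locally before re-uploading, here is a minimal sketch in Python. It is not what Launchpad itself runs; the function name `check_encoding` and the wording of the messages are made up for illustration. It flags a header that still carries the `charset=CHARSET` placeholder and reports any line that does not decode as UTF-8.

```python
import re

# The literal placeholder discussed above; a filled-in header would say
# charset=UTF-8 instead.
PLACEHOLDER = re.compile(rb"charset=CHARSET\b")


def check_encoding(data: bytes) -> list[str]:
    """Return human-readable problems found in the raw bytes of a .pot file.

    Hypothetical helper for illustration; it does not reproduce Launchpad's
    own import checks.
    """
    problems = []
    if PLACEHOLDER.search(data):
        problems.append("header still says charset=CHARSET")
    # UTF-8 multi-byte sequences never contain a newline byte, so decoding
    # line by line is safe and gives a usable line number.
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            problems.append(f"line {lineno}: not valid UTF-8")
    return problems
```

To use it, read the file in binary mode (for example `open("template.pot", "rb").read()`) and pass the bytes in. An empty list means neither problem was found, which would point back at the error reporting rather than the file.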

By the way, I still expect the import to fail when this is fixed, because the syntax for specifying message contexts is wrong. (BTW, I think we discussed this a long time ago: I do think you have the right approach for representing symbolic, non-English msgids in gettext here). But instead of:

msgctext strings.combos.box1.lua

the file should say:

msgctxt "strings.combos.box1.lua"

—so no "e" in "msgctxt," and quotes around the context string.
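A quick way to catch this kind of slip across a whole template is a small lint pass. The sketch below is only an approximation under stated assumptions: it treats any line starting with "msgc" as an attempted context line, and it accepts only the keyword `msgctxt` followed by a double-quoted string. The name `find_bad_contexts` is hypothetical, and this is not a full PO parser.

```python
import re

# A well-formed context line: the keyword msgctxt, whitespace, then a
# double-quoted string.
VALID_MSGCTXT = re.compile(r'^msgctxt\s+".*"\s*$')
# Assumption of this sketch: any line starting with "msgc" was meant to be
# a context line, which also catches misspellings such as "msgctext".
LOOKS_LIKE_CONTEXT = re.compile(r'^msgc\w*')


def find_bad_contexts(text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs for malformed context lines."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if LOOKS_LIKE_CONTEXT.match(stripped) and not VALID_MSGCTXT.match(stripped):
            bad.append((lineno, stripped))
    return bad
```

Run against the two lines quoted above, it would report the first (misspelled keyword, no quotes) and accept the second.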

Jeroen

Stuart P. Bentley (stuart) said :
#2

> Now, I don't see anything in the file that looks like it's not ASCII.
> But I do see that it specifies its character set as "charset=CHARSET".
> Try replacing that with "charset=UTF-8"... does that solve the problem?
> If so, this may be a small bug in how we report errors with unknown
> character sets. If not, we'll keep looking for weird characters.
>
That's probably it. Looking at the standard header, it was hard to tell what
was supposed to be filled in and what wasn't, seeing as how *every official
script* leaves the top 3 lines nearly completely unchanged (some of the FSF
ones change the copyright name).

> By the way, I still expect the import to fail when this is fixed,
> because the syntax for specifying message contexts is wrong. (BTW, I
> think we discussed this a long time ago: I do think you have the right
> approach for representing symbolic, non-English msgids in gettext here).
> But instead of:
>
> msgctext strings.combos.box1.lua
>
> the file should say:
>
> msgctxt "strings.combos.box1.lua"
>
> —so no "e" in "msgctxt," and quotes around the context string.
>

D'oh!! (You have no idea how much grief this mistake caused me when
searching the documentation. You'd think I would have caught on.)

Matthew Revell (matthew.revell) said :
#3

This appears to be solved.