Discussion:
encoding question
Allin Cottrell
2017-03-24 00:09:14 UTC
Sorry, this is quite ticklish but I'll try to explain it as best I
can.

I'm not sure, from reading the gnuplot help on "encoding", of the
exact scope and effect of giving a "set encoding XXX" command in a
plot file.

Here's the context: my program writes a gnuplot command file,
designed to produce PNG output via the pngcairo "terminal", and
among the users of the program are people working on Windows in
Russian. There are two possible non-ASCII elements in the plot file:

1) the name of the output file (as in "set output 'OOO'"), which for
MS Windows in Russian will be encoded in CP1251; and

2) strings occurring in titles, labels or whatever in the body of
the plot: by default these will be in UTF-8, which is what pngcairo
expects.

At present I'm sticking a line into the plot file:

set encoding utf8

which I hope is going to tell gnuplot, "Whatever you might think
based on the fact that you're working on Windows in Russian, please
interpret titles/labels as being in UTF-8."

So here's the question: given that the output filename is in CP1251,
is my "set encoding" line liable to interfere with gnuplot's output
routine (for example, such that output cannot be written because
some non-ASCII component of the path is non-existent, if the bytes
are interpreted as UTF-8), or is gnuplot's I/O mechanism separate
and insulated from "set encoding"?
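
To make the mixed-encoding situation concrete, here is a minimal Python sketch of a plot file of the kind described above. The path, the title text and the helper name are made up for illustration; only the two encodings (CP1251 for the file name, UTF-8 for the labels) come from the description.

```python
# Hypothetical plot-file contents: a CP1251 output path plus a UTF-8 title.
output_path = "c:/users/иван/график.png"   # made-up Russian path
title = "Доход"                            # made-up plot title

lines = [
    b"set encoding utf8",
    b"set output '" + output_path.encode("cp1251") + b"'",
    b'set title "' + title.encode("utf-8") + b'"',
]
script = b"\n".join(lines) + b"\n"


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if the byte string is valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The title line is valid UTF-8, but the CP1251 path is not, so the
# file as a whole has no single consistent encoding.
print(decodes_as_utf8(lines[2]))   # True
print(decodes_as_utf8(lines[1]))   # False
print(decodes_as_utf8(script))     # False
```

Anything that reads the "set output" string as UTF-8 will therefore see invalid input, which is exactly the case the question is about.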

As you might expect, this is not merely hypothetical: I'm getting an
error report from a Russian Windows user, and I wonder if the fact
that wgnuplot.exe is exiting with a non-zero code when trying to
process a command file written by my program might have something to
do with a text encoding issue.
--
Allin Cottrell
Department of Economics
Wake Forest University
sfeam
2017-03-24 03:10:27 UTC
Post by Allin Cottrell
set encoding utf8
which I hope is going to tell gnuplot, "Whatever you might think
based on the fact that you're working on Windows in Russian, please
interpret titles/labels as being in UTF-8."
That much is fine. It also has the effect, for the png terminal and
some others, that when you specify a font by name it will try to find
a version of it that uses your specified encoding.
Post by Allin Cottrell
So here's the question: given that the output filename is in CP1251,
is my "set encoding" line liable to interfere with gnuplot's output
routine (for example, such that output cannot be written because
some non-ASCII component of the path is non-existent, if the bytes
are interpreted as UTF-8), or is gnuplot's I/O mechanism separate
and insulated from "set encoding"?
Gnuplot does not care what is in the string used as a file name.
Linux/Unix also does not care what is in the string used as a file name.
Any sequence of bytes (apart from '/' and NUL) is a legal filename even if
it is not printable.
Windows - I'm not so sure. There are two ways that it might go wrong
on Windows that I have heard of, and I suppose they might interact
badly.
Caveat: I don't use Windows myself, so I'm only repeating what I have
seen mentioned elsewhere.

(1) Windows filesystems only allow certain encodings for file
names, and UTF-8 is not one of the allowed encodings.
https://msdn.microsoft.com/en-us/library/windows/desktop/dd317748(v=vs.85).aspx

(2) At least some incarnations of Windows used a magic byte sequence
known as BOM to indicate the encoding used by a text file. If your gnuplot
script file contains UTF-8 anything, some Windows machines are unhappy
if it does not start with BOM. On the other hand if it _does_ start with BOM
then strings in the script file that are really CP1251 rather than UTF-8
might (I am guessing) be converted inappropriately.
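
For reference, the UTF-8 BOM mentioned in (2) is the three bytes EF BB BF at the very start of a file. Here is a small Python sketch for checking whether a script file starts with it. The helper name and the sample bytes are illustrative, and whether any particular Windows tool or gnuplot build actually cares about the BOM is not settled here.

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the data begins with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)   # b"\xef\xbb\xbf"


with_bom = codecs.BOM_UTF8 + b"set encoding utf8\n"
without_bom = b"set encoding utf8\n"

print(starts_with_utf8_bom(with_bom))      # True
print(starts_with_utf8_bom(without_bom))   # False
```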

So I think your question is actually a Windows + script file format question
rather than anything specific to gnuplot. I doubt that "set encoding"
matters, but mixing UTF-8 and CP1251 in the same script file may
be intrinsically problematic on Windows.
Post by Allin Cottrell
As you might expect, this is not merely hypothetical: I'm getting an
error report from a Russian Windows user, and I wonder if the fact
that wgnuplot.exe is exiting with a non-zero code when trying to
process a command file written by my program might have something to
do with a text encoding issue.
Does the same script work if the file names it refers to are strictly ASCII?

Ethan
Allin Cottrell
2017-03-24 18:57:11 UTC
Post by sfeam
Post by Allin Cottrell
set encoding utf8
which I hope is going to tell gnuplot, "Whatever you might think
based on the fact that you're working on Windows in Russian, please
interpret titles/labels as being in UTF-8."
That much is fine. It also has the effect, for the png terminal and
some others, that when you specify a font by name it will try to find
a version of it that uses your specified encoding.
OK so far!
Post by sfeam
So I think your question is actually a Windows + script file format question
rather than anything specific to gnuplot. I doubt that "set encoding"
matters, but mixing UTF-8 and CP1251 in the same script file may
be intrinsically problematic on Windows.
I ran an experiment to try to assess this. Booted Windows 8 (ugh) and
created a directory named Beauté (that's with an e-acute) on my
Desktop. I then created two copies of a simple gnuplot script to
produce a PNG file. Each included the line

set output 'c:/users/cottrell/desktop/Beauté/test.png'

(encoded in cp1251). The two files were identical except that one of
them included the line

set encoding utf8

before the "set output" line. (And the accented character in the
output filename was the only non-ASCII character in the files.)

I then called wgnuplot.exe on the two scripts from the command line in
a cmd.exe window. The one without "set encoding utf8" worked to
produce the PNG, the other didn't. To see what was happening I then
tried opening wgnuplot interactively and using the "load" command to
run the scripts. The variant without "set encoding" again worked fine;
the other one gave:

set output 'c:/users/cottrell/desktop/Beaut?/test.png'
cannot open file; output not changed

(note that in gnuplot's error message echoing the "set output" line
the e-acute has been changed to a question mark, actually not an
ASCII question mark but an "unrecognized glyph" symbol).

It therefore seems that "set encoding" has somehow altered gnuplot's
reading of the bytes in the output filename. (Once again, those bytes
are identical in the two files.) If gnuplot had simply passed the
incoming cp1251 bytes to the OS, surely the output file would have
been opened OK in both cases.
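
One way to see why the same bytes can stop working is to look at them as UTF-8. The Python sketch below assumes the e-acute was stored as the single byte 0xE9, which is its code in the Western European code page CP1252 (in CP1251 that byte is a Cyrillic letter instead). The exact byte is an assumption about the test files, not something stated above.

```python
# Assumed bytes of the path in the script: "Beaut", then 0xE9, then "/test.png".
raw_path = b"c:/users/cottrell/desktop/Beaut\xe9/test.png"

# In a single-byte code page, 0xE9 is an ordinary character.
as_cp1252 = raw_path.decode("cp1252")
print(as_cp1252.endswith("Beaut\u00e9/test.png"))   # True

# In UTF-8, 0xE9 announces a three-byte sequence, but the bytes that
# follow ("/t") are not continuation bytes, so strict decoding fails.
try:
    raw_path.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)   # False

# A lenient decoder substitutes U+FFFD, the "unrecognized glyph" symbol.
lenient = raw_path.decode("utf-8", errors="replace")
print(lenient == "c:/users/cottrell/desktop/Beaut\ufffd/test.png")   # True
```

If something between the script and the OS treats the filename as UTF-8, the directory name it ends up with is no longer the one that exists on disk, which would be consistent with the "cannot open file" message.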

Allin Cottrell
Ethan A Merritt
2017-03-24 20:12:09 UTC
Post by Allin Cottrell
It therefore seems that "set encoding" has somehow altered gnuplot's
reading of the bytes in the output filename.
No, I don't think that is what is happening.
Post by Allin Cottrell
(Once again, those bytes
are identical in the two files.) If gnuplot had simply passed the
incoming cp1251 bytes to the OS, surely the output file would have
been opened OK in both cases.
What seems to be happening is that in syscfg.h on Windows it says

/* The unicode/encoding support requires translation of file names */
#define fopen win_fopen

and wmain.c:win_fopen() indeed tries to translate the name from the
current gnuplot encoding into Windows Unicode text.
I think the comment is wrong. File names should *not* be translated,
as you are finding out. The current gnuplot encoding is a separate
thing from the encoding used in the source code of the script.

I only see this code in the development version, not in the source
for 5.0.5 or 5.0.6. So I guess your bug report is specifically for
the development version?

I'll defer to the Windows crowd here, but my tentative diagnosis
is that addition of a win_fopen() wrapper for fopen() in 5.1 should
be reverted.

Of course if you are seeing this same problem with 5.0 then my
diagnosis is wrong :-/

Ethan
Allin Cottrell
2017-03-24 21:09:22 UTC
Post by Ethan A Merritt
What seems to be happening is that in syscfg.h on Windows it says
/* The unicode/encoding support requires translation of file names */
#define fopen win_fopen
and wmain.c:win_fopen() indeed tries to translate the name from the
current gnuplot encoding into Windows Unicode text.
I think the comment is wrong. File names should *not* be translated,
as you are finding out. The current gnuplot encoding is a separate
thing from the encoding used in the source code of the script.
I only see this code in the development version, not in the source
for 5.0.5 or 5.0.6. So I guess your bug report is specifically for
the development version?
I'll defer to the Windows crowd here, but my tentative diagnosis
is that addition of a win_fopen() wrapper for fopen() in 5.1 should
be reverted.
Aha, this is very interesting! Yes, I'm using the development
version on Windows so your diagnosis seems very plausible. But
actually, now I (think) I understand what's going on, I _like_ the
idea behind win_fopen.

If I've got this right, it would let me standardize on consistently
UTF-8 gnuplot script files (including representing Windows paths in
UTF-8), and let gnuplot take care of recoding paths on the fly as
needed for interaction with the OS.

It's ugly and error-prone to mix text encodings in a single file,
but I guess that's what you have to do with gnuplot 5.0 if you want
(a) to represent titles, labels and so on in UTF-8, but (b) to
include Windows filenames that contain non-ASCII characters. It
sounds like gnuplot 5.1 could improve on that. I can now try the
experiment of keeping "set encoding utf8" but recoding Windows paths
to UTF-8 when writing them into a gnuplot script. If that works, I'm
happy!
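
On the writing side, that experiment amounts to something like the following Python sketch: take the path as bytes in the legacy code page, decode it, and emit the whole script in UTF-8. The function name, the sample path and the title are hypothetical. CP1251 is used because it is the code page mentioned at the start of the thread.

```python
def build_utf8_script(path_bytes: bytes, path_codepage: str, title: str) -> bytes:
    """Return a gnuplot script encoded entirely in UTF-8.

    Paths containing a single quote would need escaping; ignored here.
    """
    path = path_bytes.decode(path_codepage)   # legacy bytes -> text
    lines = [
        "set encoding utf8",
        "set terminal pngcairo",
        f"set output '{path}'",
        f'set title "{title}"',
    ]
    return ("\n".join(lines) + "\n").encode("utf-8")


# Made-up example: a Russian directory name as CP1251 bytes.
legacy_path = "c:/users/иван/график.png".encode("cp1251")
script = build_utf8_script(legacy_path, "cp1251", "Доход")

# The result decodes cleanly as UTF-8 and contains the original path.
text = script.decode("utf-8")
print("c:/users/иван/график.png" in text)   # True
```

Whether gnuplot then opens the file correctly depends on the filename translation discussed above being present in the build.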

(But of course if the win_fopen wrapper is preserved the backward
incompatibility needs to be made clear -- though it probably affects
rather few people.)

Allin Cottrell
Allin Cottrell
2017-03-25 23:42:32 UTC
Post by Allin Cottrell
Aha, this is very interesting! Yes, I'm using the development
version on Windows so your diagnosis seems very plausible. But
actually, now I (think) I understand what's going on, I _like_ the
idea behind win_fopen.
If I've got this right, it would let me standardize on
consistently UTF-8 gnuplot script files (including representing
Windows paths in UTF-8), and let gnuplot take care of recoding
paths on the fly as needed for interaction with the OS.
It's ugly and error-prone to mix text encodings in a single file,
but I guess that's what you have to do with gnuplot 5.0 if you
want (a) to represent titles, labels and so on in UTF-8, but (b)
to include Windows filenames that contain non-ASCII characters. It
sounds like gnuplot 5.1 could improve on that. I can now try the
experiment of keeping "set encoding utf8" but recoding Windows
paths to UTF-8 when writing them into a gnuplot script. If that
works, I'm happy!
The experiment was successful. I could create a clean UTF-8 encoded
gnuplot script (including a non-ASCII Windows path for "set
output"), and gnuplot's win_fopen handled interaction with the OS
correctly in the background. So I would definitely be in favor of
keeping win_fopen.

(Reminder for anyone trying to follow this: win_fopen is a special
facility in the development version of gnuplot. It has the effect of
recoding filenames in a gnuplot script from whatever is set via "set
encoding" to Windows-compatible 16-bit Unicode before they are
passed to the C-library function fopen().)
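
As a rough model of that recoding step (not gnuplot's actual code, which lives in wmain.c and is not shown in this thread), the filename bytes are decoded according to the current "set encoding" value and re-encoded as UTF-16 for Windows. The Python function below is hypothetical and only mimics the data flow. Substituting U+FFFD for undecodable bytes is this sketch's choice, not a claim about what the real function does with invalid input.

```python
def to_windows_wide(name_bytes: bytes, gnuplot_encoding: str) -> bytes:
    """Model of the translation: decode with the declared encoding,
    then encode as little-endian UTF-16 as used by the Windows API."""
    text = name_bytes.decode(gnuplot_encoding, errors="replace")
    return text.encode("utf-16-le")


name = "Beaut\u00e9"   # e-acute written as an escape

# Filename bytes that match the declared encoding survive the round trip.
ok = to_windows_wide(name.encode("utf-8"), "utf-8")
print(ok.decode("utf-16-le") == name)    # True

# Legacy single-byte bytes declared as UTF-8 do not: the e-acute is lost,
# so the resulting name refers to a directory that does not exist.
bad = to_windows_wide(name.encode("cp1252"), "utf-8")
print(bad.decode("utf-16-le") == name)   # False
```

This is why the earlier mixed-encoding script failed while the all-UTF-8 one worked: the translation is only correct when the filename bytes really are in the encoding named by "set encoding".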

Allin Cottrell
sfeam
2017-04-02 03:41:39 UTC
Post by Allin Cottrell
The experiment was successful. I could create a clean UTF-8 encoded
gnuplot script (including a non-ASCII Windows path for "set
output"), and gnuplot's win_fopen handled interaction with the OS
correctly in the background. So I would definitely be in favor of
keeping win_fopen.
(Reminder for anyone trying to follow this: win_fopen is a special
facility in the development version of gnuplot. It has the effect of
recoding filenames in a gnuplot script from whatever is set via "set
encoding" to Windows-compatible 16-bit Unicode before they are
passed to the C-library function fopen().)
Allin Cottrell
Can you suggest where in the documentation we could add this information?
Putting it under "encoding" will not help unless the poor user who hits it
has already diagnosed it as an encoding problem.
Where did you look when you first hit the original problem?

Ethan
p***@piments.com
2017-04-02 08:08:22 UTC
Post by sfeam
Can you suggest where in the documentation we could add this information?
Putting it under "encoding" will not help unless the poor user who hits it
has already diagnosed it as an encoding problem.
Where did you look when you first hit the original problem?
Ethan
If there are cross-platform issues, then maybe the doc should contain a
section on that, at least as a central point linking to more specific
information on individual issues.

Also, "encoding" is rather a programmer's or solution-derived perspective,
not a user's perspective. The user probably does not think he is
"encoding" anything.

From the user's point of view it probably has to do with accented vowels,
natural language or non-English language filenames, labels or whatever.

I have commented on a few such cases in the past where the info is there
but you need to know the answer in order to find it because it is not
linked to anything related to the user's problem.


Gnuplot makes platform abstraction fairly transparent, but there always
seem to be a few things which are not completely OS-agnostic.

Peter.
