I would like to use fc-match to figure out how substitution works when the application fills in the charset field of a pattern. For example, I see that filenames in my browser use different fonts depending on whether they are in English or Russian. I tried issuing

    fc-match "Bitstream Vera Sans:charset=something"

but I got a 'segmentation fault'. I tried

    fc-list : charset

to figure out what I can put there, but it gave me no clue.
Yes, there's no human-readable string representation for charsets. Now that the cache doesn't use strings, perhaps we can replace the old nasty representation with something sensible.
I fixed the segfault at least; still remaining to be decided is how to present charsets in a sensible fashion.
Maybe supporting well-known charset names like ISO8859-* would be more useful, or the Unicode block names, since there are no fonts covering everything in the world.
(In reply to comment #3)
> maybe supporting the well-known charsets name like ISO8859-* would be more
> useful. or the block name in Unicode since there are no fonts covering
> everything in the world.

Nah, ISO8859-* is not that interesting, and would need data tables that I really want to see die forever. Unicode blocks are not interesting because of all the holes and rare characters. You rarely find any font supporting a full block, except maybe for the ASCII and Latin1 blocks.
Sure. Well, a side effect of supporting this might be the possibility of improving how fonts are rated, so that the better font gets selected. Right now fontconfig has the orth files per language. I think this direction is right, because rendering characters with a different font per charset, as we have seen with X core fonts, was really ugly. However, there is a dilemma of strict orthography vs. lazy orthography, as in Bug#17619. We still need some input from the user through the fontconfig config to determine which one they prefer in terms of quality etc. Still, how many charsets for a specific language a font supports is measurable, and a font supporting more charsets should be preferred.

For example, there are several charsets for Japanese, like JIS X 0201, JIS X 0208, JIS X 0212 and JIS X 0213, plus some revisions of them. 0201 and 0208 are a must to support Japanese, whereas 0212 and 0213 may be optional in most cases, but are nice to have. So I'd suggest having separate tables for charsets, linked to the orth file with some information indicating whether each one is mandatory or optional, then giving them different ratings and selecting the better fonts accordingly. Or maybe it would even be good to have a way to do this per character code in the config. Well, it's off topic for this issue though.
commit e708e97c351d3bc9f7030ef22ac2f007d5114730
Author: Behdad Esfahbod <behdad@behdad.org>
Date:   Thu Jul 3 17:52:54 2014 -0400

    Change charset parse/unparse format to be human readable

    Previous format was unusable.  New format is ranges of hex values.
    To choose space character and Latin capital letters for example:

    $ fc-pattern ':charset=20 41-5a'
    Pattern has 1 elts (size 16)
            charset:
            0000: 00000000 00000001 07fffffe 00000000 00000000 00000000 00000000 00000000
    (s)
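
For anyone trying to read the fc-pattern output in the commit message above, here is a rough Python sketch of how the string '20 41-5a' appears to map onto the printed bitmap. This is not fontconfig code. The function names are made up for illustration, and the layout is only inferred from that one sample output: the "0000:" prefix is assumed to be the code point shifted right by 8 bits, and each page is assumed to be eight 32-bit words with the lowest code point in the least significant bit.

```python
def parse_charset(spec):
    """Parse a spec such as '20 41-5a' (hex values and hex ranges
    separated by whitespace) into a set of code points."""
    codepoints = set()
    for token in spec.split():
        first, _, last = token.partition("-")
        lo = int(first, 16)
        hi = int(last, 16) if last else lo
        if hi < lo:
            raise ValueError("descending range: %r" % token)
        codepoints.update(range(lo, hi + 1))
    return codepoints


def format_page(codepoints, page=0):
    """Render one 256-code-point page as eight 32-bit hex words.

    Assumption (inferred from the sample output only): word index is
    offset >> 5 and the bit within the word is offset & 31.
    """
    words = [0] * 8
    for cp in codepoints:
        if cp >> 8 == page:
            offset = cp & 0xFF
            words[offset >> 5] |= 1 << (offset & 31)
    return "%04x: " % page + " ".join("%08x" % w for w in words)


# Space (0x20) plus Latin capital letters (0x41-0x5a), as in the commit.
print(format_page(parse_charset("20 41-5a")))
```

Under those assumptions, 0x20 lands in bit 0 of the second word (00000001) and 0x41-0x5a fill bits 1 to 26 of the third word (07fffffe), which agrees with the line printed by fc-pattern.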