I know it's very complicated, but many times we need to search for just a word
in a pdf/ps and we can't.
As far as I know, no viewer supports RTL scripts yet.
------- From Behdad Esfahbod 2005-03-15 10:31 -------
Maybe a first step is to simply support searching/copy/paste of Unicode strings.
That should be quite possible given the encoding vector of PS (and PDF) fonts. SVG
should have no problem, I guess.
= Additional Comments from http://bugzilla.gnome.org/show_bug.cgi?id=300536 =
Reporter: firstname.lastname@example.org (Roee)
Please describe the problem:
When using the search function in a Hebrew document, the search text has to be
typed backwards in order to find anything in the document.
Steps to reproduce:
1. Open a document containing Hebrew text.
2. Try to search for a word
The only possible way to search is to type the word backwards.
The search should find words when they are typed in their correct order.
Does this happen every time?
GNOME bug http://bugzilla.gnome.org/show_bug.cgi?id=313230 is a duplicate of
I just wish to add that reversing the word when an RTL script is entered is not
enough! Poppler should implement the Unicode BiDi algorithm to support search
strings which contain both LTR and RTL scripts. There is an implementation named
fribidi you can use.
> As far as I know, no viewer supports RTL scripts yet.
Adobe Acrobat Reader supports searching and copying Arabic text perfectly.
> There is an implementation named fribidi you can use.
FriBidi works very well, and it's a Freedesktop project:
Why not use it in Poppler?
Note that FriBidi converts from logical (input text) order to visual (glyph) order. The problem in poppler is reverse-bidi. That is, going back from the visual order as found in the PDF to the logical text order. Poppler does an ok job at that. It sure can be improved, but fribidi is no magic bullet here. I wrote about this a bit here:
search for reverse-bidi.
An alternative would be to make poppler use fribidi to find the visual order for the search text, then match that against the visual order of the extracted text. But that goes against the current code and does not yield much immediate benefit.
The only thing I can think of that could improve poppler's behavior by using fribidi is mirroring characters like brackets when found around RTL text. That's all for now.
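To make the reverse-bidi idea concrete, here is a minimal Python sketch. It is a toy model only, not poppler's code and not the Unicode BiDi algorithm: it assumes a purely RTL line in which uppercase ASCII letters stand in for RTL letters and runs of ASCII digits are displayed left to right. The name toy_visual_to_logical is hypothetical.

```python
import re

DIGITS = "0123456789"


def toy_visual_to_logical(visual):
    """Recover typing (logical) order from display (visual) order.

    Toy assumption: the whole line is RTL, except that runs of ASCII
    digits are displayed left to right.
    """
    # Split the line into alternating digit and non-digit runs.
    runs = re.findall(r"[0-9]+|[^0-9]+", visual)
    # Walk the runs from right to left; un-reverse everything but digits.
    return "".join(
        run if run[0] in DIGITS else run[::-1]
        for run in reversed(runs)
    )


# Text as it might come out of a page, in visual order.
extracted = "123 CBA"
logical = toy_visual_to_logical(extracted)

# Once the text is back in logical order, a plain substring search
# works with the query exactly as the user typed it.
print(logical)
print("ABC 12" in logical)
```

In this toy setting, once the page text is in logical order, partial queries match as typed. Real text needs the full set of BiDi rules (embedding levels, neutrals, mirroring), which this sketch ignores.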
> The problem in poppler is reverse-bidi. That is, going back from
> the visual order as found in the PDF to the logical text order.
> Poppler does an ok job at that.
If I want to search for the string
in a PDF using evince, I have to type it backwards in the search field:
This is completely broken behaviour. How can it be considered doing "an OK job"?
Ah, I thought that had changed in the meantime.
Poppler hackers, how does the search work? I thought the search word is matched against the text extracted using the text device? If it's done that way, it should work fairly OK.
Is there any plan for a fix? It's really a big problem for RTL language users with Evince.
Please help with that; someone is offering a solution here, I think:
i've just posted a patch to implement visual to logical text conversion, which might become a step towards solving this problem.
Created attachment 68861
find bidirectional text
a small workaround for searching rtl text. limited for mixed directional text.
say ABC 123 will be rendered by fribidi as 123 CBA.
to search for this text in poppler, you'd need to search literally 123 CBA before this patch.
with this patch, search for ABC 123 as entered. nice.
but if you only search for ABC 12, nothing would be found. that's because this patch transforms the searched text from logical to visual before the actual search in the visual text inside poppler, so ABC 12 would render to 12 CBA, that's not there.
there's a better way to go, which i'll implement later. this would also help with bidi text select and copy.
this patch will only work if you first apply my last patch to bug 55977. you also need fribidi or preferably icu.
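The behaviour described in this comment can be reproduced with a small Python sketch. It is a toy model under stated assumptions, not the attached patch and not fribidi's API: the line is purely RTL, uppercase ASCII letters stand in for RTL letters, and digit runs keep their left-to-right order. The names toy_logical_to_visual and search_like_patch are hypothetical.

```python
import re

DIGITS = "0123456789"


def toy_logical_to_visual(logical):
    """Toy stand-in for a logical-to-visual reordering step."""
    # Split into alternating digit and non-digit runs.
    runs = re.findall(r"[0-9]+|[^0-9]+", logical)
    # Lay the runs out right to left; digit runs keep their inner order.
    return "".join(
        run if run[0] in DIGITS else run[::-1]
        for run in reversed(runs)
    )


def search_like_patch(query, page_visual):
    """Reorder the query to visual order, then search the visual page text.

    Returns the match offset in page_visual, or -1 if nothing is found.
    """
    return page_visual.find(toy_logical_to_visual(query))


# The page stores text in visual order.
page_visual = toy_logical_to_visual("ABC 123")
print(page_visual)

# The full phrase is found when typed in logical order.
print(search_like_patch("ABC 123", page_visual))

# A partial phrase is not: its visual form does not occur on the page.
print(search_like_patch("ABC 12", page_visual))
```

The partial query fails because reordering depends on where the digit run ends, so the visual form of a prefix is not generally a substring of the visual form of the whole phrase.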
Adding the dependency for the current patch; the bug itself does not depend on it, but the current solution by alex does.
the last fix for #55977 will be enough to fix this bug too.
again, it's a partial solution for mixed direction text.
seriously, people. we're in 2014.
open this .pdf file in your browser - such as Firefox or Chrome -
and behold - the search function works flawlessly.
why can't this be patched into Evince/Poppler?
Because there's no patch for it. alex has proven he can't provide a valid patch.
I attached two patches to bug 55977 that should handle searching RTL text to a reasonable level. I wonder if this bug is a more appropriate place for those two patches?
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/274.