Summary: | Writer 4 cannot find regular expressions like \xAD or \x00AB | ||
---|---|---|---|
Product: | LibreOffice | Reporter: | AndreHasekamp |
Component: | Linguistic | Assignee: | Michael Stahl <mst.fdo> |
Status: | VERIFIED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | bormant, erack, gerard.fargeot, guilleron29, mst.fdo |
Version: | 4.0.0.3 release | Keywords: | regression |
Hardware: | Other | ||
OS: | Linux (All) | ||
Whiteboard: | BSA target:4.2.0 target:4.1.4 target:4.0.6 | ||
Description
AndreHasekamp
2013-05-12 15:32:39 UTC
I don't remember that \x combinations have ever worked.

(In reply to comment #1)
> I don't remember that \x combinations have ever worked.
It worked with OOo and with LibreOffice until 4.0.x (x?). It works with 3.6.6 and fails with 4.0.1.

Hi,
The ICU regexp engine is a new feature in LO 4, which replaces the custom engine. See http://www.libreoffice.org/download/4-0-new-features-and-fixes/ under Options/General, where this link can be found: http://userguide.icu-project.org/strings/regexp#TOC-Regular-Expression-Metacharacters.
Have a nice day,
Jacques Guilleron

A difference with hex values for find and replace: in LO 4.0.2.2, if I enter \x00AB, the character is not found, but with \xAB it is. Unfortunately, that does not work for \xAD. I can find it only if I enter the character directly. There may be other differences.
Jacques Guilleron

Hi,
I'm afraid I'll have to study this ICU document first; never seen it before. So, consider this bug 64495 withdrawn.
Kind regards,
Andre Hasekamp.

Eike, is there any problem here with "\x..." regex search that needs fixing?

Help needs to be updated. The ICU regular expressions differ slightly in details from the home-brewed OOo expressions. In this example, four hex digits following \x are not accepted; \x accepts only two hex digits, for values <= 255, so \xhh. More hex digits (1-6) are accepted in the form \x{hhhhhh}. These two forms are identical to those of Perl regular expressions. As an ICU Unicode extension, the form \uhhhh with exactly four hex digits can also be used, or \Uhhhhhhhh with exactly eight hex digits. For more details see the metacharacters URL mentioned above: http://userguide.icu-project.org/strings/regexp#TOC-Regular-Expression-Metacharacters

As for \xAD, which according to comment 4 does not work, I'm not sure: is a soft hyphen even part of the text? Isn't it only generated by word breaking? Does \u00AD find it? On the other hand, I spotted some special treatment of 0x00AD in sw/source/core/crsr/findtxt.cxx, in SwPaM::DoSearch(), which for bRegSearch is supposed to set bRemoveSoftHyphens = false;

*** Bug 63261 has been marked as a duplicate of this bug. ***

Michael Stahl committed a patch related to this issue.
It has been pushed to "master":
http://cgit.freedesktop.org/libreoffice/core/commit/?id=dca5163b6ef206ceb1f2d56feb7546c1929afe60
fdo#64495: sw: fix regex search for soft hyphen \xAD
The patch should be included in the daily builds available at http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More information about daily builds can be found at: http://wiki.documentfoundation.org/Testing_Daily_Builds
Affected users are encouraged to test the fix and report feedback.

Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-4-1":
http://cgit.freedesktop.org/libreoffice/core/commit/?id=386d0c5d663fe50295be3714977a54b86212f766&h=libreoffice-4-1
fdo#64495: sw: fix regex search for soft hyphen \xAD
It will be available in LibreOffice 4.1.4.

Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-4-0":
http://cgit.freedesktop.org/libreoffice/core/commit/?id=6add0104e250fd8653a93450d371404aa3ff3a6c&h=libreoffice-4-0
fdo#64495: sw: fix regex search for soft hyphen \xAD
It will be available in LibreOffice 4.0.7.

Michael Stahl committed a patch related to this issue.
It has been pushed to "libreoffice-4-0-6":
http://cgit.freedesktop.org/libreoffice/core/commit/?id=730c5696c6c668c88ed071fed6f3598f0b4a2aa1&h=libreoffice-4-0-6
fdo#64495: sw: fix regex search for soft hyphen \xAD
It will be available already in LibreOffice 4.0.6.

So it turned out that the soft hyphen (\xAD) needs special handling code in Writer, which is fixed now. You can use any of the ICU-supported Unicode literal syntaxes, e.g.
\xAD
\x{00AD}
\u00AD
\U000000AD
\N{SOFT HYPHEN}
but the legacy syntax \x00AD is no longer supported, and that will not be fixed. I have now adapted the help content on master accordingly to document \uXXXX.

Michael Stahl committed a patch related to this issue.
It has been pushed to "master":
http://cgit.freedesktop.org/libreoffice/help/commit/?id=f81edbd66fc4d0b6cf03949bb2339c9be9ee989c
fdo#64495: help: regex \xXXXX is no longer supported

Hello,
Verified with LO 4.2.0.0.alpha0+
Build ID: 71e1c79acebab5fc6a31457416c24c4a33141c33
TinderBox: Win-x86@42, Branch:master, Time: 2013-10-27_23:53:26
Thank you, Michael, for the time spent on fixing this.
Jacques

Setting to verified as of comment #15.
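
For readers comparing the escape forms discussed in this thread, here is a minimal Python sketch of how a leading code point escape could be classified. It is only an illustration of the rules as stated in the comments (two digits after \x, one to six digits inside \x{...}, four after \u, and eight after \U as in the \U000000AD example); it is not ICU or LibreOffice code, and the function name parse_icu_escape is hypothetical.

```python
import re
import unicodedata

# Illustrative patterns for the escape forms named in the thread.
# These restate the rules from the comments; they are NOT taken from ICU.
_HEX_FORMS = [
    (re.compile(r"\\x\{([0-9A-Fa-f]{1,6})\}"), r"\x{h...}"),  # 1-6 digits in braces
    (re.compile(r"\\x([0-9A-Fa-f]{2})"), r"\xhh"),            # two digits
    (re.compile(r"\\u([0-9A-Fa-f]{4})"), r"\uhhhh"),          # four digits
    (re.compile(r"\\U([0-9A-Fa-f]{8})"), r"\Uhhhhhhhh"),      # eight digits
]
_NAME_FORM = re.compile(r"\\N\{([^}]+)\}")                    # \N{CHARACTER NAME}


def parse_icu_escape(pattern):
    """Return (code_point, form, rest) for a leading escape, or None.

    `rest` is whatever follows the escape and would be treated as
    further pattern text.
    """
    for regex, form in _HEX_FORMS:
        m = regex.match(pattern)
        if m:
            return int(m.group(1), 16), form, pattern[m.end():]
    m = _NAME_FORM.match(pattern)
    if m:
        try:
            ch = unicodedata.lookup(m.group(1))
        except KeyError:
            return None  # unknown character name
        return ord(ch), r"\N{name}", pattern[m.end():]
    return None


if __name__ == "__main__":
    for p in [r"\xAD", r"\x{00AD}", r"\u00AD", r"\U000000AD",
              r"\N{SOFT HYPHEN}", r"\x00AB"]:
        print(p, parse_icu_escape(p))
```

Under these assumptions, the legacy form \x00AB is read as \x00 followed by the literal text "AB", which is consistent with the report that \x00AB no longer finds the character while \xAB does.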