Ticket #104 (new defect)

Opened 5 years ago

Last modified 4 years ago

PageIndex problem

Reported by: dartar
Owned by: unassigned
Priority: normal
Milestone: 2.0
Component: actions
Version: 1.1.6.1
Severity: normal
Keywords: i18n alpha
Cc:

Description

DarTar and I recently created some test pages to check how Wikka evaluates 'valid' page names: http://wikka.jsnx.com/,My,Page and http://wikka.jsnx.com/ÄhnLich. Now look at where they show up on http://wikka.jsnx.com/PageIndex... surely there shouldn't be two '#' indices?? --JavaWoman

I think this is due to the encoding of the MySQL tables. I can't see how that could be fixed with a small change to the code. If that is the cause, I hope it rings the i18n bell too. Translation to Greek is my business ;) --GeorgePetsagourakis

Yes, my i18n bells were already loudly ringing... but you bring up a good point with MySQL. I hadn't thought about a real cause yet, just wanted to record what I saw before I got distracted again ;-). Sort order is a whole subject of its own within i18n .... --JavaWoman

Change History

Changed 4 years ago by DarTar

  • milestone changed from 1.1.6.2 to 1.2

needs a more general solution compatible with i18n

Changed 4 years ago by MovieLady

I brought this up on ValidPageNames, but I think it would be more useful here.

There's a clause in MySQL called COLLATE that can be used in SQL statements to tie the characters in use to the correct sorting and comparison order for a given language. This should be a relatively easy way to fix the problem (I think, but don't quote me, as I haven't done anything with charsets in MySQL yet): define a constant for the character sets supported by MySQL, then use that constant in queries throughout the code. That would provide more consistent and flexible support for international users. :)

 An example of collation's effects in an order by clause.
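
To make the effect of a collation on an ORDER BY clause concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so it is not Wikka code: the table name `pages`, the column `tag`, the sample page names, and the `accent_folded` collation are all assumptions made up for illustration. Only the idea carries over: the same query returns rows in a different order once a collation is named.

```python
import sqlite3
import unicodedata

# Sample page names (assumed for illustration). "\u00c4" is the precomposed "Ä".
TAGS = ["Zebra", "\u00c4hnLich", "Apple", ",My,Page"]


def fold(text):
    """Strip combining accents and ignore case, so 'Ä' compares like 'a'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def accent_folded(a, b):
    """Collation callback: negative, zero, or positive, like strcmp."""
    fa, fb = fold(a), fold(b)
    return (fa > fb) - (fa < fb)


def sorted_tags(collation=None):
    """Return all tags, ordered with the default or the named collation."""
    conn = sqlite3.connect(":memory:")
    # Register the hypothetical collation under a name usable in SQL.
    conn.create_collation("accent_folded", accent_folded)
    conn.execute("CREATE TABLE pages (tag TEXT)")
    conn.executemany("INSERT INTO pages (tag) VALUES (?)", [(t,) for t in TAGS])
    order = "tag COLLATE " + collation if collation else "tag"
    rows = conn.execute("SELECT tag FROM pages ORDER BY " + order).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(sorted_tags())                 # byte order: 'ÄhnLich' lands after 'Zebra'
print(sorted_tags("accent_folded"))  # 'ÄhnLich' sorts next to 'Apple'
```

In MySQL the collation would be one of the server's built-in ones rather than a Python callback, and the constant suggested above would hold its name.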

Changed 4 years ago by JavaWoman

The fact that ',My,Page' and 'ÄhnLich' show up under a # index in the first place is due to the RE used in pageindex to recognize starting alpha char for a page name: /[A-Za-z]/. This is inconsistent with both the REs used in the Formatter, and those in the Link() method (see #34, #71). A central RE library should have a separate define for a 'WikiName first character' which could be used in pageindex and re-used in the Formatter and Link() method (and possibly others) REs to recognize WikiNames. (I'm not sure why there would be two # indices though.)

Allowing other 'alpha' characters than /[A-Za-z]/ is also part of internationalization (#340).
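
The point about the first-character RE can be sketched as follows. This is Python rather than Wikka's own code, and the names `ASCII_FIRST`, `UNICODE_FIRST`, `heading`, and `index_headings` are hypothetical. The sketch also assumes, without confirmation from the pageindex source, that the index emits a new heading each time the heading changes while walking the sorted list. Under that assumption, names that fail /[A-Za-z]/ and sort at both ends of the list would produce two separate '#' sections.

```python
import re
from itertools import groupby

# The expression the ticket attributes to pageindex.
ASCII_FIRST = re.compile(r"[A-Za-z]")
# Hypothetical shared 'WikiName first character' define: any Unicode letter
# (word characters minus digits and underscore).
UNICODE_FIRST = re.compile(r"[^\W\d_]")


def heading(name, first_char_re):
    """Index heading for a page name: its first letter, or '#' if not alpha."""
    return name[0].upper() if first_char_re.match(name) else "#"


def index_headings(sorted_names, first_char_re):
    """Headings emitted when walking an already sorted list of page names."""
    return [key for key, _ in groupby(sorted_names, key=lambda n: heading(n, first_char_re))]


# Page names in byte order, as a database without a suitable collation
# might return them ("\u00c4" is "Ä").
names = [",My,Page", "Apple", "Zebra", "\u00c4hnLich"]

print(index_headings(names, ASCII_FIRST))    # ['#', 'A', 'Z', '#']
print(index_headings(names, UNICODE_FIRST))  # ['#', 'A', 'Z', 'Ä']
```

Even with the wider expression, 'Ä' still lands after 'Z' here, so the sort order would need to be addressed separately from the first-character test.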

Changed 4 years ago by JavaWoman

Exactly the same expression is used in the pageindex action so this will exhibit the same problem behavior.

Changed 4 years ago by JavaWoman

Scratch last comment... (typo)

The same expression is used in the mychanges and mypages actions, so these will exhibit the same problematic behavior.
