ÿØÿà JFIF    ÿÛ „ !.%+&8&+/1555$;@;4?.451 4,$,44444444444414444444444444444444444444444444444444ÿÀ  á á" ÿÄ     ÿÄ ?    !1AQaq"2‘¡±ÁðBRbrÑá#‚’¢²3S CñÿÄ   ÿÄ !    !1QAa‘2ÿÚ   ? 5˜Z¯V¦cø)›t/? z¨±>Õ5€¶‹Á¤·¼z¼Ü¬+ñ®v¤¨_ˆR­BFn©—˜ý®ç̝P8gýt·ÉSTŦˆìät?þé¼íìN/Þa)ì–í6ô… Ï¿øÃj´¿KÇü]ÿ ªô¹-eKànëÕHTx}ýSÜ›ÿ ”7Ø×&µ<¦  ¥ÑO¶[Ù¯ä¨ÞÃÿ PZ-¬;#õ|•oaÿ ©CìÞz3˜öː/¤­ñTûIØ}š^ mÓ%ªxˆ¥ÉŸu=Z+ISe¿45™¼u;ú&WØ÷€æßQ™®{|íx*TC“#ZŠìZ§²‹ 6pv…³¿¡äª*áZÐ%ÒOáˆo"x«OHk w±æ+¬V(kMúŸ5Vö«$ ÁrÏbàb57/luR ¸ÑÛj Òµì`Мq­û žICÀÊ•©4€Âcà¨Ï€O´<èÐ:›ù(Ë^L8þ‘ÍÌ#¸Ð_Ì©ÙK(Öz 4¬û+¸;ü’V’84‘¬ÃŽ:[â‡ÔÌáõp¢~§ªlæ£ö{®G>J¼"°‡7¯ÆÉèßû ‹É‹§ÁòÃýâßî ^ƾÙõ‹×óH#«LP½ïX=xÑÍ$|W?•~• îëÔ©ª‹ {ÝT…Kÿ ”hûâá)J*ö˜–ÔU;iÇ€/ ÆþjóZ\ýwØ=Ìm ºèËL9 ýèÆð/¨’¥öo=nË.%Îì ŽÕ¯È|{Oj²ƒE6e/ßdÄõ²Ìâ1O®ò×TsəԸhOMýíMˆ¿¼H˜l²,7Â¥#MF/Úf°Ö½± ¸–dr‹NýÊ íjqx{œÉ ä-È ¦ øÄër¨q°ð †nцýÑÄÆ’mä…n<0È™;ÁÝá¯ÁZƒ7FÀmì­ É&9ˆîéi¶ùN§Y• ÃZãAâ?•‡©‰ , ó¾IŸŠc1 4â&y­&pŠ­6;M À 0¹qç»p.á …ŸÅáK@%6·y6ƒ‰3?”úºŽ‰éX5ªPT §µ!=Mž«Ú½‹ÅgÂSâÉaþÓoö–¯ÁÔìR>5éÿ üs¶ÆUcÌ kÇR ]ÿ ù¬¼«VŽ;Â|‡~¢¦”ÏŰæ {L™Õ°Óv¹ò¸írޡעCÃ!íVÕ {¶»sŒNPg/ "uÕbkm²“$ďå¿é¹§°½æz¯6 †s¿!s–wÚÝ“™Œ °.ûj>·+™Òa…©Œ&rÝÎtÛë긪Ît’LAVp%c Úý[ÄzJ¾ÇàXXç@˜ó<êL]·T˜¾¥1Ó©V‡g´æ½¦Ý@¹óø!_@´ÞâSÁ —S3™•& ]@JHÚý©ZŽ €×æÔr»Áf!‡yÞ4Mv*èÓã_{‘åóUuљØ«Oïé*®EvÑ Œ÷‡U \"㪒ÍK+À 4“M¡ï:0¥5í!'<@î´”>Ç»&Z–ïCCV˜Ì5Šo&îhè.žû |ÓK©h$s6KìŒëã)¹hI¦GïOåóI;ììü#É$Š0…Ææ¥TØ.5­¾gn´ “ÂÖ\:hœ89G)J@„}œ:’Ò{/Š"¦_Æ×7Æ3VÇŠÊa]ÚŒÙ€Ä–=®uÁßâACZƒ§§£ Qnâ:«,×{tyø¬iÛcœÜÄ€H½ÄÍCk´÷šß .W'b¤Íåh]÷€=,Žv×cÚEÚHXJX¶îo¨FÒtèöŸ>ªª6[J®Fµ£sGÁeqõfe\íjÒÐïÄÐGˆe1Ø‹.Ø”‘Ëuø Y­ˆÜ ŽG|zùªüMpDnQWÄ”%JŠ™)â*p@Örš«ÕT2Ð%ˆG#ª„ ·¤!°ŸOTÂT¸aÚ%4&h™LµšØüÐ.F¿²ÐÞ_Ç‚¾ÅÃaÜ÷09Æ q€öy˜v‡85õN÷]¬äѼóS{°_MެúÔ#°Ç¸0åÞè2ëôPcvÆw9®ií1Ä8F™˜à‰´+‰Ik1òÝ7“Ñ×ÒsÝ\x‚h`ÞÑ`ó"|µEcý£n˜h`}GÞ !±ù²Ápü²ß6 0ïi󜵩SÈÇ7˜-ÕURO˜¦´f$ªž-Í6(œ}<„ éc øs]ŽŽ„*—¾ ìdŽ„)méª\¿êÎIg¾ØÞ~I#C/¼¼´EÁÈŽi8“©õådô·>euä ƒ'Ê×लR1ÉJE1ÐAát`t;ÇР%Ý<‡¥„ÍÆ`×Oyó)õiI€ñQaŸ4Ûù\áàaÃÔ¹HÃu¹*k€¦<„e S‡&õÏ B!ŽhüÞ`yj}mªf×\¿ Ç~æ­9‡û\՞Ǖg²1Žû5V7 !àöšm° c`ܬøÇìµÒ'P"?…´Ö,"§^•õލsÔ)6˜sæéÍR¼ ò|Sl”‹7 nPW Gòú÷½§O¯‡„l¡kSÞŒr½PÊ@æ¢pŽ-mÿ #Ÿ˜Àº¶Áä¦;ïÔæ$1££`“Õ>„—·ž)ßð³ñ#Ï Ô$¶œ‰ÊE‹À;÷º ¯«P:Ñ”8–IÊtpÞ3ª“>ê“þës4ò2OÏÕ­±zô†Õ§‰.÷ä¸;¿˜“'œ›žª}«Œ{ª±Ì 9ÔóÞÕ‡0 $íWV3Üì¬ —@kÝ4@¿r¼±½¬™›?øØæ´'Áé®CË3-g$˜ö‡×auÚi´Žp/êÛ æF›Ú2v‹ã¿¿,nB1̨ƃqÞa5͝@&Æû“él÷ \C²½UÍc ¯k×¢U ÖéQå™—-r wô ÞÏ<Ò=&=ÿ Ôê Òêˈt,i—;LîÜ á¸*ÚÃ1$êL•LÍ <É)ýÐà’ ;F™{ƒ™˜€&'}‚ãÄK`¡ÞT@I;®žZóè‚s’7®°›+§O­Åq©é»²9<Ô J ¼9O’HL»Ùïì¸rk¼Ž_ý‘TŸu[²ßÚŒ·ü÷B%¯E ŸÔX5êO´ Ç•€’I0 ÉJX` ñ¹õ%;µŸD‘«´€àwÒ™U ûئžÖö\×®×´8 ½‡ºÐÆÓ§?Àkmœ=;d5*@-ì0F Rªýš[Ü6âö̃ڸr*KA9· u*µæ£?U¸Âêí†8@¦X4 e-ò„0s{ HâUpU?¼mñRa°®a%Ð'tÉ×’\¾ÊÉ]t›h>·(Ë@R¼¡Ãt h}’O÷au<+nT…Ö…MӐ??Óe95 q>í/;&JSû °¯ÊéÞ øƒ*Ã2½Ài&:nôUl=¾¿5eˆ3”ñc|Ú2V”>„»&eE;«ÚäC p¢Û úy 9š[ŒÌx¼擼A&DåÒ¯ˆ¤ÀÌ;"˜ ÏQä¸åhÊ}Ûq«Û0WžÒ|»€ø®öCm5•\ÇÀ§Pe3£]0ÃàLDÉ‰1øªxjgwT‚÷¿LΨK‹›ùs—xˆÜ±µ kæ¸f‰‰ÜGk/LÛØ6d9ò¶ùA{ƒA3š/¬D¬khÓk‰`˜"㯒r¿±Óã jx‡°e}<Ñø\3y:'À•/h½Í€Ç4~g ?Û(¼]v‘ªlKÎâ~?O‚W%{Ì:“'©úNq¾›úo(X’¥¯ˆ nFê{Ç€ü?º'ë ø‹ì Þ09ŒÌç9Æ —ËC`j@ÓÄ(+a‹un¸#ÂꟋ{K`‘ÑÍÍ'à´»/Û,KW;Þ4²þð ï Nm|~fGÏ(…³Ã)«1ö­Õ ¥‡¨©ƒÃ™ü-s=à=U66Ï«Ýc蓦W¹íž®›nÔ%êÇìŒ<#Ü×84ån®Ð ÒåOC` ñânÑs‡¢ç 1õ%Îhì½Ã½® e:ݼUZo™`  ÅZŸŒÊ«ê1ÏÄo$q¹Þ€©ˆhÐÉä¯ñ[!…Ú˜àJ:x2$Íß&PåT£6ç— ‡Í*4Ýšçjÿ ‰É nófÐ ó(L5C•åÆ\rMÒ@ò }y-W}™üýVù—ú¢=Ù”c®‘< M ž ´Phr ¦©TD ‘ù.$´÷O‡‘V2Æò.=IUŒ=ž‡â¬i™aþÓåÙ?òUø'ØÖ•.~* šTŒ!•-×áºTâ®ä#õü'´ eýlYÅÓeÕKÂrT"CÚ@u!Óxƒ{š3€}1¿(r}%«nËamjÑ%ÑNEò v ˜à  σöK³,*º.àzù¨™Ó ÚçâU¦*¿ 9{%Ö¹ njûdaXöb) kÛÆ±ûÓ\°M7ˆÂ=û›ç¿Ã‚­V»Cg–8ÙêE- j)k$º`Ã-ùEýeBÆÇ]c¡°ñty&Òd0nõ'¡W+ƒ*|–øµFa\GQªEAÔp5\Ǽ·¼Ç8·õ -â§Ú[ ‡ uZeÖ 3}×d'+¹:ð+K†Û®s!Ï$úe€<Û”x)1»a­¡LC]¸µík…ÚàA»AYº{†ªS[¦5HÒ7ù --,ísòDØ€èk ÞÀîÜ ò@â( ËNˆë›4ô½•/¦o‡€Û7 ê•ÆêòðÜy'Án½µ á˜ݦ ndeo…[ì¶Ê,¥R³Ä=À±—–ß;£™´ñSâ*g§”ïaið‘Jå~™ÓÞ ß³Õ¢»8x埒²52>AÊb&-÷\7´éÄù€T˜,w;3{ï˜k…à¹ÄqÀ«œ{€\ ˆ¾[´¨јr &Úé„Ívˆ±8†¿]|¬ņ4I×pÞS1ÈÖz‰#Ìv‡G!YNògñ:màTz¢Ý1ô©^O=~ë|5Bã™ç•¼µõ•bÆ@úÕS¬ÈŒ#¬zünrŸ û” Z²•èðV"ÁHÚý©wÝ €7¼Ìu1hÑa3Éä û f$o¿É ™Ú›ÝçnpÒ3äÌ3†Í§,Äï]$‰/pê †«À¼¸e9­Æê_C]žƒ·ý·frÁN«, E=›Çq -‰öŒ:aÏ¿±í&£Í:-} 84‘ÿ eƒQÑeëSsuiA ³g㟥ú£?ÿ ʼn*”“÷aühe:ÊWa@ÒÞk±eØ] F Ô—r.åä˜ @ö¥ªZoÐýYL·¥S²G/‡ñ <~*ZÆ´è>JlòàÛÆ½ÿ 窘ìGN¢:I®KšJp/`íIÁÀõ#Ä-€ö­šµŒoF4|ÆQØÆ@Ì|£Ô…¢À{9˜è½Üó›€ôYÒÎYsið;ís¤€à²ˆ‚4qÉVŒI$ ‰"° æµ8cXGjœˏ¡Aâý•ËÜ¢ûï e·çLx']á"oÅÎê3¯Ç—¹”ó0nå‚âg{Œñ> S´˜îè°g238‚ãköÝfÚd´6Ò€;ò÷±¢™¼›º ¢Æ'¥Ðx'e¬ç ]bÈÆV¢ó‹kýBO ðÊâ$Ÿ!×T 3Mýמ žìٍàÌü‘8÷€àæØ8æ©6‰©L´«…oãpð„~Çk‰!ñ;‹”ÛžÍ àž±z Ÿôû øŸÝužÏ;ÿ #|u6™Þ¬ÚˆÐõA4¶â|ôl|Ê2ŽÇ¤ÝÅÇY.<#Aí.k§hóF‚”Y; M½Ö4hŸ4&›­¿tès´%FìL¥£Ãk‰ÇT¤haÁ¤ÚxfÉ`ÑìË›>i 3t‚:,–+^÷´–{Û–Nxi"x‘Ûg î¨>¥Õ܁ùZH,2Û“:8xÊ¢Çí9.É-Ìâã-=çjwµS˜dütžçwýGòú®®ûº_ˆýx$–¡ãøO EÚÛÏ÷R„×w+3£Á£öUMyR²¹âŒ°š›¸Ñãò9§Ó_Dl+Ùßc›úšGÅÌc†Ž!Ko=¶.‘Îÿ c²(2®V mª.ÿ ¹B›¹å ù„öŸSV>™ü¯$y:G¢Z×àøúdî¹û­·ýÇ´:•c LÍõi_‹ö+ÎæGÊè>OŠ•äž´§Þ{X}¨1ÚTc›»Qþ•êô°t¿OP?eæ~É{5]•ÙR£r5†nZ\ã@ &îJõ ¾àC°þV>fé¥/ü5ñÊIº_é5 ;e­h<@ Ä&æÃëE%;X,ÒãÆÞ`Oò¦kŸm#˜!ÀyÄ¢| óLšò¥Ä` ¶R=|ÈCâh5ò3DˆïF†ðÒ#ÅìÛœ?¸yhBãœí ZxßÎÄhºRK„`Þödvײ™ÀÈÑÒgŒuY w³%†ƒÓzõ ÖÏp‚dH®¦A´ù§»ÓÇMæ~)ˆð‡û:ù&Ä •vGD´À n ݇¼Ö8Fö óáà£~Ë¥x`oK|Ä?fxiØü%pìR>éò+Û±éÎ>núlFŤ'tq8LZÏvÃ?„¡ß±È⽆¯³íü@x|PöUäèØã¡ð‚ŒAìÏ"vÍwóŸÍ{ ý0.z È•Ö{,N¡£¡ŸKÕÙž>Ýœþ ÍÀ°<×EA!Å‚D™IúOÍ¡>ôG}Â` ÍßkÜL™Ž Þð™ {IøF²¹òQ3&!ÃÂÞz.d&Ï-sH¸,Ôõ˜ŽP€ 77ˆÝ¼ÊëÜw =cÕ Ú,ØÐ5ÎYÐ)ì´öœgŒ[¤ßv㙑8心>h]§µháYš£²ºÑ.{Ï7Sð•?´~×SÃKýJÛ˜ ™Íäiúu<µX¶1õ^kâçIÑ£sZ4h>j*ÔšD:4­¿_ ÷¸ Õxæÿ ¸?Mù _•­ÊÐ ä ÷ý ÑwL œ­ïnTkÛUÍN©ë:¦fV ¶ÜÔÜMªÅâA½–¿R×TXš-%iTÊT•‡Ù‚JôϐZxWÑè‰f‰òG º ×Õû2aZ7OU3[“×AT–ÞŒ…-‘¤”Ì ì&(ˆ¿­•ƒkï’:ðY¦W‘ Å)“†‘˜³Åtcø˜ñTÂwÚÇ4|üLÇªí–v- qˆèU qPE.†â‘˜µ Æ,ÐÅs]8¾„oúÑ i>ÜxxÈó)ƒ ´æÁâØ$À‰vžŸf$Ž |ãw;ÀÁIJ»b` {¦Ó¤Ú$©YÀ‘n@Óïž«9J¼êG m¤ ܯ¹ÌW4€ÐÒÅÛ‡#褕Ÿn-?í|с¥÷Ú¹¬'´ÞÜ9ÓK `hê£SÄSà?7—Wí_´…óB›»:=Ãïq`<8ñÓŒÑlú2d¬ê³£hÖ[l|$vÝro~'R®‰§°ñmY ͧäP |PUª¹·:3Œ[Û{Xÿ ºâ@‚W–Äé u‚ ¯´*=íή.pûÒdt @G‰¬ s¸ ëÉücr ÞæÑ¨Ê@>¤¢Ö±. Þ'¯°ÌME[YéïĵÂCå½ Ué©Áû'Ê9%eÔðNU”ë‘ÌsD3/®+UI˜9h.WC”빓$#:pz:YÓ ¿xž* ³$Í +$kñAŠ‹†¢ Uê>¸)_š¬÷©ßAÂÔb9ÇU ¯¾á•9¯ÏÏ÷O÷¼¼Fähal1‰3Ì[Ïr•´UCksNÐ] R‘¸¥H+§Šé†c©vÖÞ0iÓ76s†î!§=ß ¼~Ô'°Ãmäoäš³ªøi1úÉ)³yV8 CLÄØÁ‘WYïi€H6ÖÑiámø^ÈY´°Ñ7¥Û*—Ñ©L«Qƒï—Ùrÿ ›£Ð*š¸ˆL©ˆ$ˆ ÷¾D§9È®«qbqC)–ˆïv´çñsÑVT­Ø, <àïºÀO«Jý·õ àfPìð .wFšir´þ’2_Y *Æ€x\« ì€9š@ Ž|F⇥ˆkZ@hÖÄ0t¿-<“‹qµ¾*ZL¤Ú)&BJpÓF5=$„at*Zš$’ÑtdûÝRI1 2މ$€$I$#‰SÞ’Hë¬ï;Á$¡t$’`<(ñÇt)$‡Ð.Êf¢X’Kt=Éé$‚ˆªè¢oÝëòI%Rgcª÷ŠyI%¡‰ÿ !ñ)´õ $¤ Ô’IIGÿÙ '😯', * ':(' => '🙁', * ':)' => '🙂', * ':?' => '😕', * ) ); * * true === $smilies->contains( ':)' ); * false === $smilies->contains( 'simile' ); * * '😕' === $smilies->read_token( 'Not sure :?.', 9, $length_of_smily_syntax ); * 2 === $length_of_smily_syntax; * * ## Precomputing the Token Map. * * Creating the class involves some work sorting and organizing the tokens and their * replacement values. In order to skip this, it's possible for the class to export * its state and be used as actual PHP source code. * * Example: * * // Export with four spaces as the indent, only for the sake of this docblock. * // The default indent is a tab character. * $indent = ' '; * echo $smilies->precomputed_php_source_table( $indent ); * * // Output, to be pasted into a PHP source file: * WP_Token_Map::from_precomputed_table( * array( * "storage_version" => "6.6.0", * "key_length" => 2, * "groups" => "", * "long_words" => array(), * "small_words" => "8O\x00:)\x00:(\x00:?\x00", * "small_mappings" => array( "😯", "🙂", "🙁", "😕" ) * ) * ); * * ## Large vs. small words. * * This class uses a short prefix called the "key" to optimize lookup of its tokens. * This means that some tokens may be shorter than or equal in length to that key. * Those words that are longer than the key are called "large" while those shorter * than or equal to the key length are called "small." * * This separation of large and small words is incidental to the way this class * optimizes lookup, and should be considered an internal implementation detail * of the class. It may still be important to be aware of it, however. * * ## Determining Key Length. * * The choice of the size of the key length should be based on the data being stored in * the token map. It should divide the data as evenly as possible, but should not create * so many groups that a large fraction of the groups only contain a single token. * * For the HTML5 named character references, a key length of 2 was found to provide a * sufficient spread and should be a good default for relatively large sets of tokens. * * However, for some data sets this might be too long. For example, a list of smilies * may be too small for a key length of 2. Perhaps 1 would be more appropriate. It's * best to experiment and determine empirically which values are appropriate. * * ## Generate Pre-Computed Source Code. * * Since the `WP_Token_Map` is designed for relatively static lookups, it can be * advantageous to precompute the values and instantiate a table that has already * sorted and grouped the tokens and built the lookup strings. * * This can be done with `WP_Token_Map::precomputed_php_source_table()`. * * Note that if there is a leading character that all tokens need, such as `&` for * HTML named character references, it can be beneficial to exclude this from the * token map. Instead, find occurrences of the leading character and then use the * token map to see if the following characters complete the token. * * Example: * * $map = WP_Token_Map::from_array( array( 'simple_smile:' => '🙂', 'sob:' => '😭', 'soba:' => '🍜' ) ); * echo $map->precomputed_php_source_table(); * // Output * WP_Token_Map::from_precomputed_table( * array( * "storage_version" => "6.6.0", * "key_length" => 2, * "groups" => "si\x00so\x00", * "long_words" => array( * // simple_smile:[🙂]. * "\x0bmple_smile:\x04🙂", * // soba:[🍜] sob:[😭]. * "\x03ba:\x04🍜\x02b:\x04😭", * ), * "short_words" => "", * "short_mappings" => array() * } * ); * * This precomputed value can be stored directly in source code and will skip the * startup cost of generating the lookup strings. See `$html5_named_character_entities`. * * Note that any updates to the precomputed format should update the storage version * constant. It would also be best to provide an update function to take older known * versions and upgrade them in place when loading into `from_precomputed_table()`. * * ## Future Direction. * * It may be viable to dynamically increase the length limits such that there's no need to impose them. * The limit appears because of the packing structure, which indicates how many bytes each segment of * text in the lookup tables spans. If, however, care were taken to track the longest word length, then * the packing structure could change its representation to allow for that. Each additional byte storing * length, however, increases the memory overhead and lookup runtime. * * An alternative approach could be to borrow the UTF-8 variable-length encoding and store lengths of less * than 127 as a single byte with the high bit unset, storing longer lengths as the combination of * continuation bytes. * * Since it has not been shown during the development of this class that longer strings are required, this * update is deferred until such a need is clear. * * @since 6.6.0 */ class WP_Token_Map { /** * Denotes the version of the code which produces pre-computed source tables. * * This version will be used not only to verify pre-computed data, but also * to upgrade pre-computed data from older versions. Choosing a name that * corresponds to the WordPress release will help people identify where an * old copy of data came from. */ const STORAGE_VERSION = '6.6.0-trunk'; /** * Maximum length for each key and each transformed value in the table (in bytes). * * @since 6.6.0 */ const MAX_LENGTH = 256; /** * How many bytes of each key are used to form a group key for lookup. * This also determines whether a word is considered short or long. * * @since 6.6.0 * * @var int */ private $key_length = 2; /** * Stores an optimized form of the word set, where words are grouped * by a prefix of the `$key_length` and then collapsed into a string. * * In each group, the keys and lookups form a packed data structure. * The keys in the string are stripped of their "group key," which is * the prefix of length `$this->key_length` shared by all of the items * in the group. Each word in the string is prefixed by a single byte * whose raw unsigned integer value represents how many bytes follow. * * ┌────────────────┬───────────────┬─────────────────┬────────┐ * │ Length of rest │ Rest of key │ Length of value │ Value │ * │ of key (bytes) │ │ (bytes) │ │ * ├────────────────┼───────────────┼─────────────────┼────────┤ * │ 0x08 │ nterDot; │ 0x02 │ · │ * └────────────────┴───────────────┴─────────────────┴────────┘ * * In this example, the key `CenterDot;` has a group key `Ce`, leaving * eight bytes for the rest of the key, `nterDot;`, and two bytes for * the transformed value `·` (or U+B7 or "\xC2\xB7"). * * Example: * * // Stores array( 'CenterDot;' => '·', 'Cedilla;' => '¸' ). * $groups = "Ce\x00"; * $large_words = array( "\x08nterDot;\x02·\x06dilla;\x02¸" ) * * The prefixes appear in the `$groups` string, each followed by a null * byte. This makes for quick lookup of where in the group string the key * is found, and then a simple division converts that offset into the index * in the `$large_words` array where the group string is to be found. * * This lookup data structure is designed to optimize cache locality and * minimize indirect memory reads when matching strings in the set. * * @since 6.6.0 * * @var array */ private $large_words = array(); /** * Stores the group keys for sequential string lookup. * * The offset into this string where the group key appears corresponds with the index * into the group array where the rest of the group string appears. This is an optimization * to improve cache locality while searching and minimize indirect memory accesses. * * @since 6.6.0 * * @var string */ private $groups = ''; /** * Stores an optimized row of small words, where every entry is * `$this->key_size + 1` bytes long and zero-extended. * * This packing allows for direct lookup of a short word followed * by the null byte, if extended to `$this->key_size + 1`. * * Example: * * // Stores array( 'GT', 'LT', 'gt', 'lt' ). * "GT\x00LT\x00gt\x00lt\x00" * * @since 6.6.0 * * @var string */ private $small_words = ''; /** * Replacements for the small words, in the same order they appear. * * With the position of a small word it's possible to index the translation * directly, as its position in the `$small_words` string corresponds to * the index of the replacement in the `$small_mapping` array. * * Example: * * array( '>', '<', '>', '<' ) * * @since 6.6.0 * * @var string[] */ private $small_mappings = array(); /** * Create a token map using an associative array of key/value pairs as the input. * * Example: * * $smilies = WP_Token_Map::from_array( array( * '8O' => '😯', * ':(' => '🙁', * ':)' => '🙂', * ':?' => '😕', * ) ); * * @since 6.6.0 * * @param array $mappings The keys transform into the values, both are strings. * @param int $key_length Determines the group key length. Leave at the default value * of 2 unless there's an empirical reason to change it. * * @return WP_Token_Map|null Token map, unless unable to create it. */ public static function from_array( array $mappings, int $key_length = 2 ): ?WP_Token_Map { $map = new WP_Token_Map(); $map->key_length = $key_length; // Start by grouping words. $groups = array(); $shorts = array(); foreach ( $mappings as $word => $mapping ) { if ( self::MAX_LENGTH <= strlen( $word ) || self::MAX_LENGTH <= strlen( $mapping ) ) { _doing_it_wrong( __METHOD__, sprintf( /* translators: 1: maximum byte length (a count) */ __( 'Token Map tokens and substitutions must all be shorter than %1$d bytes.' ), self::MAX_LENGTH ), '6.6.0' ); return null; } $length = strlen( $word ); if ( $key_length >= $length ) { $shorts[] = $word; } else { $group = substr( $word, 0, $key_length ); if ( ! isset( $groups[ $group ] ) ) { $groups[ $group ] = array(); } $groups[ $group ][] = array( substr( $word, $key_length ), $mapping ); } } /* * Sort the words to ensure that no smaller substring of a match masks the full match. * For example, `Cap` should not match before `CapitalDifferentialD`. */ usort( $shorts, 'WP_Token_Map::longest_first_then_alphabetical' ); foreach ( $groups as $group_key => $group ) { usort( $groups[ $group_key ], static function ( array $a, array $b ): int { return self::longest_first_then_alphabetical( $a[0], $b[0] ); } ); } // Finally construct the optimized lookups. foreach ( $shorts as $word ) { $map->small_words .= str_pad( $word, $key_length + 1, "\x00", STR_PAD_RIGHT ); $map->small_mappings[] = $mappings[ $word ]; } $group_keys = array_keys( $groups ); sort( $group_keys ); foreach ( $group_keys as $group ) { $map->groups .= "{$group}\x00"; $group_string = ''; foreach ( $groups[ $group ] as $group_word ) { list( $word, $mapping ) = $group_word; $word_length = pack( 'C', strlen( $word ) ); $mapping_length = pack( 'C', strlen( $mapping ) ); $group_string .= "{$word_length}{$word}{$mapping_length}{$mapping}"; } $map->large_words[] = $group_string; } return $map; } /** * Creates a token map from a pre-computed table. * This skips the initialization cost of generating the table. * * This function should only be used to load data created with * WP_Token_Map::precomputed_php_source_tag(). * * @since 6.6.0 * * @param array $state { * Stores pre-computed state for directly loading into a Token Map. * * @type string $storage_version Which version of the code produced this state. * @type int $key_length Group key length. * @type string $groups Group lookup index. * @type array $large_words Large word groups and packed strings. * @type string $small_words Small words packed string. * @type array $small_mappings Small word mappings. * } * * @return WP_Token_Map Map with precomputed data loaded. */ public static function from_precomputed_table( $state ): ?WP_Token_Map { $has_necessary_state = isset( $state['storage_version'], $state['key_length'], $state['groups'], $state['large_words'], $state['small_words'], $state['small_mappings'] ); if ( ! $has_necessary_state ) { _doing_it_wrong( __METHOD__, __( 'Missing required inputs to pre-computed WP_Token_Map.' ), '6.6.0' ); return null; } if ( self::STORAGE_VERSION !== $state['storage_version'] ) { _doing_it_wrong( __METHOD__, /* translators: 1: version string, 2: version string. */ sprintf( __( 'Loaded version \'%1$s\' incompatible with expected version \'%2$s\'.' ), $state['storage_version'], self::STORAGE_VERSION ), '6.6.0' ); return null; } $map = new WP_Token_Map(); $map->key_length = $state['key_length']; $map->groups = $state['groups']; $map->large_words = $state['large_words']; $map->small_words = $state['small_words']; $map->small_mappings = $state['small_mappings']; return $map; } /** * Indicates if a given word is a lookup key in the map. * * Example: * * true === $smilies->contains( ':)' ); * false === $smilies->contains( 'simile' ); * * @since 6.6.0 * * @param string $word Determine if this word is a lookup key in the map. * @param string $case_sensitivity Optional. Pass 'ascii-case-insensitive' to ignore ASCII case when matching. Default 'case-sensitive'. * @return bool Whether there's an entry for the given word in the map. */ public function contains( string $word, string $case_sensitivity = 'case-sensitive' ): bool { $ignore_case = 'ascii-case-insensitive' === $case_sensitivity; if ( $this->key_length >= strlen( $word ) ) { if ( 0 === strlen( $this->small_words ) ) { return false; } $term = str_pad( $word, $this->key_length + 1, "\x00", STR_PAD_RIGHT ); $word_at = $ignore_case ? stripos( $this->small_words, $term ) : strpos( $this->small_words, $term ); if ( false === $word_at ) { return false; } return true; } $group_key = substr( $word, 0, $this->key_length ); $group_at = $ignore_case ? stripos( $this->groups, $group_key ) : strpos( $this->groups, $group_key ); if ( false === $group_at ) { return false; } $group = $this->large_words[ $group_at / ( $this->key_length + 1 ) ]; $group_length = strlen( $group ); $slug = substr( $word, $this->key_length ); $length = strlen( $slug ); $at = 0; while ( $at < $group_length ) { $token_length = unpack( 'C', $group[ $at++ ] )[1]; $token_at = $at; $at += $token_length; $mapping_length = unpack( 'C', $group[ $at++ ] )[1]; $mapping_at = $at; if ( $token_length === $length && 0 === substr_compare( $group, $slug, $token_at, $token_length, $ignore_case ) ) { return true; } $at = $mapping_at + $mapping_length; } return false; } /** * If the text starting at a given offset is a lookup key in the map, * return the corresponding transformation from the map, else `false`. * * This function returns the translated string, but accepts an optional * parameter `$matched_token_byte_length`, which communicates how many * bytes long the lookup key was, if it found one. This can be used to * advance a cursor in calling code if a lookup key was found. * * Example: * * false === $smilies->read_token( 'Not sure :?.', 0, $token_byte_length ); * '😕' === $smilies->read_token( 'Not sure :?.', 9, $token_byte_length ); * 2 === $token_byte_length; * * Example: * * while ( $at < strlen( $input ) ) { * $next_at = strpos( $input, ':', $at ); * if ( false === $next_at ) { * break; * } * * $smily = $smilies->read_token( $input, $next_at, $token_byte_length ); * if ( false === $next_at ) { * ++$at; * continue; * } * * $prefix = substr( $input, $at, $next_at - $at ); * $at += $token_byte_length; * $output .= "{$prefix}{$smily}"; * } * * @since 6.6.0 * * @param string $text String in which to search for a lookup key. * @param int $offset Optional. How many bytes into the string where the lookup key ought to start. Default 0. * @param int|null &$matched_token_byte_length Optional. Holds byte-length of found token matched, otherwise not set. Default null. * @param string $case_sensitivity Optional. Pass 'ascii-case-insensitive' to ignore ASCII case when matching. Default 'case-sensitive'. * * @return string|null Mapped value of lookup key if found, otherwise `null`. */ public function read_token( string $text, int $offset = 0, &$matched_token_byte_length = null, $case_sensitivity = 'case-sensitive' ): ?string { $ignore_case = 'ascii-case-insensitive' === $case_sensitivity; $text_length = strlen( $text ); // Search for a long word first, if the text is long enough, and if that fails, a short one. if ( $text_length > $this->key_length ) { $group_key = substr( $text, $offset, $this->key_length ); $group_at = $ignore_case ? stripos( $this->groups, $group_key ) : strpos( $this->groups, $group_key ); if ( false === $group_at ) { // Perhaps a short word then. return strlen( $this->small_words ) > 0 ? $this->read_small_token( $text, $offset, $matched_token_byte_length, $case_sensitivity ) : null; } $group = $this->large_words[ $group_at / ( $this->key_length + 1 ) ]; $group_length = strlen( $group ); $at = 0; while ( $at < $group_length ) { $token_length = unpack( 'C', $group[ $at++ ] )[1]; $token = substr( $group, $at, $token_length ); $at += $token_length; $mapping_length = unpack( 'C', $group[ $at++ ] )[1]; $mapping_at = $at; if ( 0 === substr_compare( $text, $token, $offset + $this->key_length, $token_length, $ignore_case ) ) { $matched_token_byte_length = $this->key_length + $token_length; return substr( $group, $mapping_at, $mapping_length ); } $at = $mapping_at + $mapping_length; } } // Perhaps a short word then. return strlen( $this->small_words ) > 0 ? $this->read_small_token( $text, $offset, $matched_token_byte_length, $case_sensitivity ) : null; } /** * Finds a match for a short word at the index. * * @since 6.6.0 * * @param string $text String in which to search for a lookup key. * @param int $offset Optional. How many bytes into the string where the lookup key ought to start. Default 0. * @param int|null &$matched_token_byte_length Optional. Holds byte-length of found lookup key if matched, otherwise not set. Default null. * @param string $case_sensitivity Optional. Pass 'ascii-case-insensitive' to ignore ASCII case when matching. Default 'case-sensitive'. * * @return string|null Mapped value of lookup key if found, otherwise `null`. */ private function read_small_token( string $text, int $offset = 0, &$matched_token_byte_length = null, $case_sensitivity = 'case-sensitive' ): ?string { $ignore_case = 'ascii-case-insensitive' === $case_sensitivity; $small_length = strlen( $this->small_words ); $search_text = substr( $text, $offset, $this->key_length ); if ( $ignore_case ) { $search_text = strtoupper( $search_text ); } $starting_char = $search_text[0]; $at = 0; while ( $at < $small_length ) { if ( $starting_char !== $this->small_words[ $at ] && ( ! $ignore_case || strtoupper( $this->small_words[ $at ] ) !== $starting_char ) ) { $at += $this->key_length + 1; continue; } for ( $adjust = 1; $adjust < $this->key_length; $adjust++ ) { if ( "\x00" === $this->small_words[ $at + $adjust ] ) { $matched_token_byte_length = $adjust; return $this->small_mappings[ $at / ( $this->key_length + 1 ) ]; } if ( $search_text[ $adjust ] !== $this->small_words[ $at + $adjust ] && ( ! $ignore_case || strtoupper( $this->small_words[ $at + $adjust ] !== $search_text[ $adjust ] ) ) ) { $at += $this->key_length + 1; continue 2; } } $matched_token_byte_length = $adjust; return $this->small_mappings[ $at / ( $this->key_length + 1 ) ]; } return null; } /** * Exports the token map into an associate array of key/value pairs. * * Example: * * $smilies->to_array() === array( * '8O' => '😯', * ':(' => '🙁', * ':)' => '🙂', * ':?' => '😕', * ); * * @return array The lookup key/substitution values as an associate array. */ public function to_array(): array { $tokens = array(); $at = 0; $small_mapping = 0; $small_length = strlen( $this->small_words ); while ( $at < $small_length ) { $key = rtrim( substr( $this->small_words, $at, $this->key_length + 1 ), "\x00" ); $value = $this->small_mappings[ $small_mapping++ ]; $tokens[ $key ] = $value; $at += $this->key_length + 1; } foreach ( $this->large_words as $index => $group ) { $prefix = substr( $this->groups, $index * ( $this->key_length + 1 ), 2 ); $group_length = strlen( $group ); $at = 0; while ( $at < $group_length ) { $length = unpack( 'C', $group[ $at++ ] )[1]; $key = $prefix . substr( $group, $at, $length ); $at += $length; $length = unpack( 'C', $group[ $at++ ] )[1]; $value = substr( $group, $at, $length ); $tokens[ $key ] = $value; $at += $length; } } return $tokens; } /** * Export the token map for quick loading in PHP source code. * * This function has a specific purpose, to make loading of static token maps fast. * It's used to ensure that the HTML character reference lookups add a minimal cost * to initializing the PHP process. * * Example: * * echo $smilies->precomputed_php_source_table(); * * // Output. * WP_Token_Map::from_precomputed_table( * array( * "storage_version" => "6.6.0", * "key_length" => 2, * "groups" => "", * "long_words" => array(), * "small_words" => "8O\x00:)\x00:(\x00:?\x00", * "small_mappings" => array( "😯", "🙂", "🙁", "😕" ) * ) * ); * * @since 6.6.0 * * @param string $indent Optional. Use this string for indentation, or rely on the default horizontal tab character. Default "\t". * @return string Value which can be pasted into a PHP source file for quick loading of table. */ public function precomputed_php_source_table( string $indent = "\t" ): string { $i1 = $indent; $i2 = $i1 . $indent; $i3 = $i2 . $indent; $class_version = self::STORAGE_VERSION; $output = self::class . "::from_precomputed_table(\n"; $output .= "{$i1}array(\n"; $output .= "{$i2}\"storage_version\" => \"{$class_version}\",\n"; $output .= "{$i2}\"key_length\" => {$this->key_length},\n"; $group_line = str_replace( "\x00", "\\x00", $this->groups ); $output .= "{$i2}\"groups\" => \"{$group_line}\",\n"; $output .= "{$i2}\"large_words\" => array(\n"; $prefixes = explode( "\x00", $this->groups ); foreach ( $prefixes as $index => $prefix ) { if ( '' === $prefix ) { break; } $group = $this->large_words[ $index ]; $group_length = strlen( $group ); $comment_line = "{$i3}//"; $data_line = "{$i3}\""; $at = 0; while ( $at < $group_length ) { $token_length = unpack( 'C', $group[ $at++ ] )[1]; $token = substr( $group, $at, $token_length ); $at += $token_length; $mapping_length = unpack( 'C', $group[ $at++ ] )[1]; $mapping = substr( $group, $at, $mapping_length ); $at += $mapping_length; $token_digits = str_pad( dechex( $token_length ), 2, '0', STR_PAD_LEFT ); $mapping_digits = str_pad( dechex( $mapping_length ), 2, '0', STR_PAD_LEFT ); $mapping = preg_replace_callback( "~[\\x00-\\x1f\\x22\\x5c]~", static function ( $match_result ) { switch ( $match_result[0] ) { case '"': return '\\"'; case '\\': return '\\\\'; default: $hex = dechex( ord( $match_result[0] ) ); return "\\x{$hex}"; } }, $mapping ); $comment_line .= " {$prefix}{$token}[{$mapping}]"; $data_line .= "\\x{$token_digits}{$token}\\x{$mapping_digits}{$mapping}"; } $comment_line .= ".\n"; $data_line .= "\",\n"; $output .= $comment_line; $output .= $data_line; } $output .= "{$i2}),\n"; $small_words = array(); $small_length = strlen( $this->small_words ); $at = 0; while ( $at < $small_length ) { $small_words[] = substr( $this->small_words, $at, $this->key_length + 1 ); $at += $this->key_length + 1; } $small_text = str_replace( "\x00", '\x00', implode( '', $small_words ) ); $output .= "{$i2}\"small_words\" => \"{$small_text}\",\n"; $output .= "{$i2}\"small_mappings\" => array(\n"; foreach ( $this->small_mappings as $mapping ) { $output .= "{$i3}\"{$mapping}\",\n"; } $output .= "{$i2})\n"; $output .= "{$i1})\n"; $output .= ')'; return $output; } /** * Compares two strings, returning the longest, or whichever * is first alphabetically if they are the same length. * * This is an important sort when building the token map because * it should not form a match on a substring of a longer potential * match. For example, it should not detect `Cap` when matching * against the string `CapitalDifferentialD`. * * @since 6.6.0 * * @param string $a First string to compare. * @param string $b Second string to compare. * @return int -1 or lower if `$a` is less than `$b`; 1 or greater if `$a` is greater than `$b`, and 0 if they are equal. */ private static function longest_first_then_alphabetical( string $a, string $b ): int { if ( $a === $b ) { return 0; } $length_a = strlen( $a ); $length_b = strlen( $b ); // Longer strings are less-than for comparison's sake. if ( $length_a !== $length_b ) { return $length_b - $length_a; } return strcmp( $a, $b ); } }