
The Joomla ecosystem is sick. The JED is corrupt. Joomla leadership has turned to censorship.

I've been writing Joomla extensions for more than 10 years. Much of the time, I get paid to write custom extensions, and when a customer chooses a pricing model that leaves me the rights to the extension, I release it for free or as a paid extension (about 10% choose that pricing option). I have over 50 extensions in the JED. My entire business revolves around Joomla and writing custom extensions.

I often release extensions free, or as paid with a free (less functional) version. This gives me the opportunity to get my name out there and gives potential customers the opportunity to experience my extensions, which often leads to contract work writing custom extensions.

Joomla has, on multiple occasions, incorporated my ideas directly into the Joomla core. Here are a few examples:

Extension                        | RicheyWeb Release | Added to Joomla     | More Functionality?
System - Content Security Policy | 2018-04-27        | 4.0 on 2018-06-20   | less
System - Clean Response          | 2010-05-29        | 3.1.4 on 2013-06-27 | less
DomainRestriction                | 2011-05-17        | 3.9 on 2018-08-08   | less
Fields - Subform                 | 2017-12-08        | 3.10-dev (pending)  | less
EU e-Privacy Directive           | 2012-06-11        | 3.9 on 2018-08-27   | less
There are more; I just need to find them.

In some cases, I've commented in the Joomla GitHub pull requests explaining that they're implementing functionality that is currently available in 3rd party extensions in the JED - but recently, those comments are being deleted. I can't even defend myself against this because Joomla leadership just deletes it. https://github.com/joomla/joomla-cms/issues/11905

I don't remember how I replied, but I'm sure it contained a link to my subform field in the JED. This is the pepperstreet message deleted by Joomla:

I'm not opposed to Joomla gaining new features, but when the features are ripped from my extensions and implemented incompletely - I'm left to wonder why I should ever release another free extension! Contributing to the Joomla core is something I've also done on several occasions. I would be honored if I was asked to create a core implementation for some of my free extensions - but that isn't happening. The functionality ends up being poorly or incompletely implemented and then I'm left to explain to my customers why they should pay for mine when they can get it from Joomla for free.

When confronted, Joomla leadership shows they have no interest in preserving the work of 3rd party developers.

What's really amusing is that I'm the person who brought the capability for Joomla to accept repeating custom form field values. https://github.com/joomla/joomla-cms/pull/19025 They're using capabilities that I gave them to step on my extension.

I did it first - and I get no credit. It's infuriating.

Discuss this article in the forums (0 replies).

Independence Day is important to me, and many Americans. I thought I'd write something patriotic and non-programming related today.

Recently, Time Magazine published the now infamous "Welcome to America" cover. Featuring an illegal immigrant child crying in front of a very tall Donald Trump looking down on her, it was a very politically charged cover. Its message was that Donald Trump was separating families at the border, but there was a problem: the little girl in the photo was never separated from her mother. With that, Time Magazine had earned itself the label "Fake News".

When I saw the Time special edition on "Founding Fathers", I was already prejudiced against the issue. Time had gone SJW, and the popular thing these days is to call the founding fathers terrorists. I expected the edition to maintain the current liberal talking points and historical revisions that are popular in liberal academia. My wife actually saw it first, and her reaction was "I wonder how bad they're going to paint the founders." Having a deep interest in US history, I said "Put it in the cart - I'll tear it apart (figuratively) later."

The expectation was that the magazine would paint all of the founders with the same brush that is popular these days. They're racists, they're slave owners, they're terrorists, they killed the Native Americans... I don't deny that maybe some of them were racists (Thomas Jefferson was very much against slavery and worked to end it), and some were definitely slave owners. I'm sure that the British considered them terrorists, and killing indigenous populations was how colonization was done (it's not like the Native Americans weren't in a state of constant war anyway). These things are known, and at the time it was the way of the world.

This magazine surprised me. Not only is it well made, but it appears to have been well researched. I've read a few sections now, and it doesn't appear to have been written with any agenda other than to provide information and context - I don't see a bias in what I've read so far. Much of what I've read is information I already knew, but there is also new information (or, new to me). I'm actually relieved that I don't have to go through this magazine with a fine-toothed comb to find the inaccuracies. I was expecting to write something very different.

It's sad that a once respected publication like Time has soiled its name to the point that the first reaction to an edition like this is to expect bias. There used to be a shame associated with inaccurate reporting and "fake news" - but it's become the standard operating procedure for many news outlets and publications. Maybe the red cover taught Time a lesson.

Happy 4th of July!

Discuss this article in the forums (0 replies).

I'm a reasonable guy. I spend a lot of time freely supporting extensions that I also give away for free. A few of my extensions are paid, and I support them as well. For users who buy my extensions, I give them a bit more attention - because they paid for it. I'm even willing to go the extra mile to make them happy, because customers remember that sort of thing and they come back for it. Even when a customer asks for a refund, if they have a reason (usually, any reason will do), I'll issue it. This is partly because it's not worth the time spent arguing over a few dollars, and partly because I don't want to be unreasonable...it takes too much energy.

My attitude changes radically when I'm threatened or given an ultimatum. I do not take kindly to threats. It's also not in my nature to accept false accusations.

When Jeff Hecht of clikzdigital.com contacted me about his Nomad Pro purchase, he started his message with an accusation of misrepresentation. Don't take my word for it, here's the email:

Almost immediately, I received another message (a reply to the automated paid invoice email he received) - this time accusing me of fraud and threatening a negative review in the JED:

I don't take this sort of thing lightly. I bend over backwards, 60-70 hours a week, to make sure my users and customers are happy. When someone makes a threat against my integrity and reputation, I don't take it sitting down. I agreed to refund his purchase, but first I asked him to point out where I stated that Nomad was a "login redirection" plugin. I knew well that neither my Nomad documentation, nor the Nomad download page, nor the JED page for the free (last updated August 07, 2017) or the paid extension (last updated August 23, 2017) says it is a login redirection plugin - in fact, they all very clearly state (as the first sentence of the description) "Nomad is not login redirection, it's homepage redirection!!!". The point was to suggest that he had made a mistake in claiming that I advertised Nomad as a login redirection plugin. My response:

All of this is easily proven by looking at the pages themselves, and at the Internet Archive. The first sentence of the plugin page on the JED has remained the same since 2014!
Nomad 2014: https://web.archive.org/web/20141228065735/https://extensions.joomla.org/extensions/extension/access-a-security/site-access/system-nomad/

Because his accusations arrived within an hour of his purchase, I gave him a little time to respond. Then I received this notice from the JED:

So now I get to deal with lies posted on the JED by someone who either didn't bother to read ANYTHING about the extension he purchased and is trying to bully his way into a refund - or is trying to scam his way into a free copy. Either way, it didn't need to go this way. Naming and shaming isn't something I want to do, but it seems like Jeff Hecht insists on being a liar, so my only defense is truth.

I'm not an unreasonable person. If Mr. Hecht had responded reasonably (or at all), I would not be doing this. A simple "my mistake, I purchased this thinking it was login redirection" would have worked for me. Instead, I'm here...doing this. Just be reasonable and don't threaten to post lies about me. There's a reason I post my phone number and other contact information: I've never cheated anyone. I have nothing to run away from or hide from.

Mr. Hecht got his refund. But it's going to cost him some reputation.

Discuss this article in the forums (2 replies).

For one of my customers, I provide an LMS (Learning Management System) which I built myself. It's a neat system that tracks users' progress through annual training. I won't get into much detail because it's an exclusive service I provide to only this one customer. Something I will divulge (because it's necessary to establish context for this post) is that it supports SCORM.

Something I enjoyed very much was writing a SCORM Javascript/PHP interface. It's one of those love/hate things. I'm really proud of my JS/PHP interface, but I really hate SCORM... Specifically, I hate being required to overcome the limitations built into SCORM, like the CORS/cross-domain limitation...that was tough to overcome...it took a few days to think my way around that one. I might even release the code for that (cross-domain SCORM hosting).

Adding insult to injury, the modules I'm working with are created by Articulate Storyline. The reason this makes it worse is that Articulate has decided to compress/encrypt the data being stored in the LMS. There is a segment of very useful data that's locked away behind a proprietary storage format. In their forums, they've remarked about how useful the data would be...right before they say "make a feature request"...which their customers have been doing for at least 4 years while begging for access to the data.

"How is it useful?", you might ask. Well, it contains things like slides viewed, the answers to questions present in the module, time spent....really useful stuff for anyone running an LMS. Personally, I use it to display progress to the users watching the modules (by counting unique slides viewed and dividing by the total number of slides I can present the percentage of completion)....this isn't any proprietary information they're storing, it's really mundane data. In fact, the compression/encryption they've employed isn't a particularly good compression. They could probably reduce the complexity of their code by just storing it in JSON format....like everyone else does.
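As a sketch of the completion calculation described above - the function name and sample values here are illustrative, not taken from my LMS:

```php
<?php
// A sketch of the completion calculation described above. The function
// name and sample values are illustrative, not from my LMS.
function completionPercent(array $viewedSlides, int $totalSlides): float {
    // users revisit slides, so de-duplicate before counting
    $unique = array_unique($viewedSlides);
    return round(count($unique) / $totalSlides * 100, 1);
}

// 8 records, but only 7 distinct slides ('1112' was revisited), out of 10 total
echo completionPercent(array('1012','1112','1212','1312','1412','1512','1612','1112'), 10); // prints 70
```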

I only bring up Articulate because of a new requirement from my customer - xAPI/TinCan support. Like its predecessor, it has a data storage requirement, and like its predecessor - Articulate has compressed/encrypted the data...big surprise.

I asked my customer for a large xAPI module to harvest some data from, and harvest I have done: 115k worth of state data destined for a server that wasn't paying any attention to it (because I haven't written the LRS code yet). I want to know what's in this data first - so when I deploy xAPI next to SCORM for my customer, the transition will be seamless. The code I wrote to decrypt/decompress the SCORM data will get a relative that does the same thing to xAPI state data - the SCORM version isn't working on the xAPI state data as-is, although the data looks very similar. My goal is to refine the process and find the true method to recover the viewed-slide data.

Here's a very small sample of the data I'll be obsessing over for the next few days. The state data ignored by my test server for the first 11 slides.

2M146070ji1001112a0101201112~201r100000000000000000000000000v_player.6RcLlMaxzl8.5yCw4XHlO1J1^1^0000000000000000000
2T16607080on1001212f010120111201212~201r300000000000000000000000000v_player.6RcLlMaxzl8.663onvqKo3V1^1^0000000000000000000
2_1860708090ts1001312k01012011120121201312~201r700000000000000000000000000v_player.6RcLlMaxzl8.6ccmrL9ioUo1^1^0000000000000000000
252a60708090a0yx1001412p0101201112012120131201412~201rf00000000000000000000000000v_player.6RcLlMaxzl8.5vbbOBcgbsm1^1^0000000000000000000
2c2c60708090a0b0DC1001512u010120111201212013120141201512~201rv00000000000000000000000000v_player.6RcLlMaxzl8.5rwUxY42BNH1^1^0000000000000000000
2j2e60708090a0b0c0IH1001612z01012011120121201312014120151201612~201r$00000000000000000000000000v_player.6RcLlMaxzl8.5rykj01k7mc1^1^0000000000000000000
2q2g60708090a0b0c0d0NM1001712E0101201112012120131201412015120161201712~201r$10000000000000000000000000v_player.6RcLlMaxzl8.6onHE6Mpydo1^1^0000000000000000000
2x2i60708090a0b0c0d0e0SR1001812J010120111201212013120141201512016120171201812~201r$30000000000000000000000000v_player.6RcLlMaxzl8.5oFwUyK7Xt21^1^0000000000000000000
2E2k60708090a0b0c0d0e0f0XW1001912O01012011120121201312014120151201612017120181201912~201r$70000000000000000000000000v_player.6RcLlMaxzl8.5XHqRZf2nMZ1^1^0000000000000000000
2O2m60708090a0b0c0d0e0f0g0~201$1001a12T0101201112012120131201412015120161201712018120191201a12~201r$f0000000000000000000000000v_player.6RcLlMaxzl8.6RCS0OVapMt1^1^0000000000000000000
2Y2o60708090a0b0c0d0e0f0g0h0~281~2411001b12Y0101201112012120131201412015120161201712018120191201a1201b12~201r$v0000000000000000000000000v_player.6RcLlMaxzl8.5wzXzefGwxp1^1^0000000000000000000

I've already identified a few different patterns in this data, so this won't take long.

Looking a little closer, the patterns change a bit with a larger dataset. I believe the data I'm looking for starts with the 1st "100" and ends with "~201r", so I'm just going to show that section from the last state stored for the module. Something I notice is that the first and last piece of data match - for this line, "1c1g" is found immediately after the start marker and immediately before the end marker. I believe that's the last viewed slide after looking at a section of data where I pressed "Next" then "Prev" then "Next" again. I haven't identified the next piece of data, but it's always followed by a "010" which I believe to be another kind of start marker.

Here's where it gets interesting. I don't think the delimiter is 1201 but 120...and maybe not even 120, but another token that increments - possibly giving value to the numbers that follow. Whatever it is, it's 3 digits, because the numbers following it form a pattern: 11,12,13,14,15,16,17,18,19,1a,1b,1c. Converted from hex, they're 17,18,19.... It doesn't make sense to start with 11, though - and where the module transitions from 1.13 to 2.1, the delimiter changes to 130 and there's an anomalous 10 (16 in decimal). The first set starts with 11, and all of the rest of the sets start with 10. I also know it's not hex - the numbers extend beyond f - but I haven't discovered an upper limit yet, so I'll keep treating it as hex+ until I establish one (which may require that I ask someone to build me a module with a section holding more than...62? slides, so I can find out what happens when 0-9, a-z and A-Z are exhausted).

1001c1g~2Gc0101201112012120131201412015120161201712018120191201a1201b1201c120101301113012130131301413015130161301713018130191301a1301b1301c1301d1301e1301f1301g1301h1301i1301j1301k130101401114012140131401414015140161401714018140101501115012150131501415015150161501715018150191501a1501b1501c1501d1501e15010160111601216013160141601516016160171601017011170121701317014170151701018011180121801318014180151801618017180101901119012190131901419015190101a0111a0121a0131a0141a0151a0161a0171a0181a0191a01a1a0101b0111b0121b0131b0141b0151b0161b0171b0181b0191b01a1b01b1b01c1b01d1b01e1b01f1b01g1b01h1b01i1b01j1b01k1b01l1b01m1b01n1b01o1b01p1b01q1b01r1b01s1b01t1b0101c0111c0121c0131c0141c0151c0161c0101d0111d0121d0101e0111e0101f0111f0121f0131f0141f0151f0161f0171f0181f0191f0101g0111g0121g0131g0141g0151g0161g0171g0181g0191g01a1g01b1g01c1g~201r

I think I got it wrong again...but the more I look at this, the more it makes sense. The start markers haven't changed, except that 2nd 010 delimiter...that's actually data! The first section does start with a "10" like the rest...I just didn't see it at first. I believe the actual delimiter is 0 and each record is 4 characters: two pairs of hex(ish) digits. The progression of numbers suggests the page is first and the section is second. That makes the first entry (1012, or 10/12) translate to 16,18 - followed by 17,18 and 18,18.

1001c1g~2Gc0101201112012120131201412015120161201712018120191201a1201b1201c120101301113012130131301413015130161301713018130191301a1301b1301c1301d1301e1301f1301g1301h1301i1301j1301k130101401114012140131401414015140161401714018140101501115012150131501415015150161501715018150191501a1501b1501c1501d1501e15010160111601216013160141601516016160171601017011170121701317014170151701018011180121801318014180151801618017180101901119012190131901419015190101a0111a0121a0131a0141a0151a0161a0171a0181a0191a01a1a0101b0111b0121b0131b0141b0151b0161b0171b0181b0191b01a1b01b1b01c1b01d1b01e1b01f1b01g1b01h1b01i1b01j1b01k1b01l1b01m1b01n1b01o1b01p1b01q1b01r1b01s1b01t1b0101c0111c0121c0131c0141c0151c0161c0101d0111d0121d0101e0111e0101f0111f0121f0131f0141f0151f0161f0171f0181f0191f0101g0111g0121g0131g0141g0151g0161g0171g0181g0191g01a1g01b1g01c1g~201r

I have a feeling that the first piece of data after the last viewed slide is a counter - time in module, maybe. The first entry is a0 and the last is ~2Gc. I need to figure out what base this number is. Obviously, the digit set contains ~ as well as upper and lower alpha and numerics. If I assume it's base 63 (0-9a-zA-Z~), the first value stored (a0) is equal to 630 milliseconds...and that doesn't sound out of the realm of possibility: time to click next...maybe time to load the first slide... The last value, ~2Gc, is equal to 258.5585 minutes...if my assumption is correct. I'll have to test this theory.
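To make the base-63 guess concrete, here's a quick sketch. The digit order (0-9, then a-z, then A-Z, then ~) is my assumption; I chose it because it reproduces both values mentioned above:

```php
<?php
// A sketch of the base-63 guess above. The digit order (0-9, a-z, A-Z, ~)
// is an assumption; it reproduces both values mentioned in the post.
function base63_decode(string $digits): int {
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ~'; // 63 characters
    $value = 0;
    foreach (str_split($digits) as $d) {
        // positional notation: shift left by one base-63 digit, add the new one
        $value = $value * 63 + strpos($alphabet, $d);
    }
    return $value;
}

echo base63_decode('a0');           // 630 - plausible as milliseconds
echo base63_decode('~2Gc') / 60000; // 258.5585 - minutes, if the guess holds
```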

a0,f0,k0,p0...~2Gc - it increments at regular intervals (a to f = 5, f to k = 5, k to p = 5), so it has nothing to do with time. After the last entry above (Y0), it jumps to ~21 - so I think it's a flag - but what are the 4 digits between Y and 2? It's safe to assume there's a Z and a 1 in there: Z??12. Time spent would be nice to gather, but this isn't it, and it doesn't seem to have anything to do with the data I'm interested in (slides viewed) - so I'll ignore it for now...back to the task at hand: slides.

At this point, I can work up some regex to isolate the data I'm interested in. This should get me to the start of the dataset: 100(.*?)[0-9a-zA-Z]{4}[~]{0,1}[0-9a-zA-Z]{1,}?(?=010) - and the end of the set is a constant ~201r with a dataset in between starting with 0....easy.

preg_match('/100(.*?)[0-9a-zA-Z]{4}[~]{0,1}[0-9a-zA-Z]{1,}?(?=010)(?P<data>(.*?))~201r/',$data,$matches);
print_r($matches);
Array
(
    [0] => 1001112a0101201112~201r
    [1] =>
    [data] => 0101201112
    [2] => 0101201112
    [3] => 0101201112
)

Now I can access just the data I want to work with in $matches['data'];

Because the data is delimited with 0 and also contains 0 within some elements - I can't use explode. There are other ways.
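To illustrate the problem, here's a quick demonstration (not part of my code) of what explode would do to the first dataset:

```php
<?php
// A demonstration of why explode() fails here: the "0" delimiter also
// appears inside the 4-character records, so the split shreds them.
print_r(explode('0', '0101201112'));
// yields '', '1', '12', '1112' - four fragments instead of
// the two 4-character records actually encoded (1012 and 1112)
```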

preg_match_all('/0(?=[0-9a-zA-Z])(?P<item>[0-9a-zA-Z]{4})/',$matches['data'],$items);
print_r($items);
Array
(
    [0] => Array
        (
            [0] => 01012
            [1] => 01112
        )

    [item] => Array
        (
            [0] => 1012
            [1] => 1112
        )

    [1] => Array
        (
            [0] => 1012
            [1] => 1112
        )

)

And now I have an array of viewed slides. It may contain duplicates, so if you need a count and not breadcrumbs you can use array_unique to clean it up. Here's the function I'm using in my system.

function slides($data) {
    // isolate the dataset between the "100" start marker and the "~201r" end marker
    preg_match('/100(.*?)[0-9a-zA-Z]{4}[~]{0,1}[0-9a-zA-Z]{1,}?(?=010)(?P<data>(.*?))~201r/',$data,$matches);
    // split the dataset into individual 4-character slide records, each preceded by a "0" delimiter
    preg_match_all('/0(?=[0-9a-zA-Z])(?P<item>[0-9a-zA-Z]{4})/',$matches['data'],$items);
    return $items['item'];
}

By the way - this should just as easily decode Articulate SCORM suspend_data to retrieve the slide count....or whatever you modify the function to retrieve. Happy trails!

Update - more data has become available. My customer agreed to send me a test module with 65 slides within one section. This should show me the base numbering system used in the stored items. I'll update when I have an accurate picture of that data.

This new data is interesting. 65 slides proved not to be enough - so it was bumped to 85 slides, and that revealed the base number system is 64 digits (but it's not standard base64). As regex: [0-9a-zA-Z_$]. For some reason I thought it would be bigger - but I'll take what I can get. Something else is interesting about this data - the ~201r end marker is apparently not an ending marker after all. I'll have to review each of the state submissions to find where it disappears. The page/section digits are not always 4 digits: when the page rolls over at $, it grows to 3 digits, making the page/section 5 digits - so I'll need to adjust my regex statements to accommodate the changing lengths. Knowing the entire charset of the numbering system will let me decode everything, but I'll have to come up with a system to manage the decoded numbers in order to accommodate a change in the length of digits. Here's what an 85-slide section's data looks like. I'm leaving this uncolored, because that takes a long time by hand (I haven't rewritten the regex to cut this up automatically yet).

2Ha~2G260708090a0b0c0d0e0f0g0h0i0j0k0l0m0n0o0p0q0r0s0t0u0v0w0x0y0z0A0B0C0D0E0F0G0H0I0J0K0L0M0N0O0P0Q0R0S0T0U0V0W0X0Y0Z0_0$001112131415161718191a1b1c1d1e1f1g1h1i1j1k1l1m1n1o1p1q1~2e7~2a71002k112~2_60101201112012120131201412015120161201712018120191201a1201b1201c1201d1201e1201f1201g1201h1201i1201j1201k1201l1201m1201n1201o1201p1201q1201r1201s1201t1201u1201v1201w1201x1201y1201z1201A1201B1201C1201D1201E1201F1201G1201H1201I1201J1201K1201L1201M1201N1201O1201P1201Q1201R1201S1201T1201U1201V1201W1201X1201Y1201Z1201_1201$1202011202111202211202311202411202511202611202711202811202911202a11202b11202c11202d11202e11202f11202g11202h11202i11202j11202k112B0v_player.6crDQVV0N7p.5dnr7ivoCLN1^1^00000

This is going to become a set of functions to not only decode and list, but also translate.

OK, I've had some time to play with the data and I've discovered that the ~201r ending delimiter is not always present. It seems to appear when you have a certain amount of data, so I can't rely on it being present. After adjusting the regex, I can get a reliable result on both datasets.

function slides($data) {
    $matches = array(); // to make my IDE happy
    $items = array(); // to make my IDE happy
    // the 4-5 digit value after "100" reappears at the end of the dataset,
    // so capture it as "end" and backreference it instead of relying on "~201r"
    $dataregex = '/100(?P<end>[0-9a-zA-Z_$]{4,5}(?=[0-9a-zA-Z_$~]+010))(.*?)(?=010)(?P<data>(.*?)\k<end>)/';
    // slide records are 4+ digits from the 64-character set, delimited by "0"
    $itemregex = '/0(?=[0-9a-zA-Z_$])(?P<item>[0-9a-zA-Z_$]{4,}?(?=(0|$)))/';
    preg_match($dataregex, $data, $matches);
    preg_match_all($itemregex, $matches['data'], $items);
    return $items['item'];
}

$slides = slides($data);
print_r($slides);
Array
(
    [0] => 1012
    [1] => 1112
    [2] => 1212
    ... ([3] through [8]: 1312 through 1812) ...
    [9] => 1912
    [10] => 1a12
    ... ([11] through [34]: 1b12 through 1y12) ...
    [35] => 1z12
    [36] => 1A12
    ... ([37] through [60]: 1B12 through 1Y12) ...
    [61] => 1Z12
    [62] => 1_12
    [63] => 1$12
    [64] => 20112
    [65] => 21112
    ... ([66] through [83]: 22112 through 2j112) ...
    [84] => 2k112
)

Now that I have the character set for the base numbering system, I'll look into writing a function (or two) to decode it into something useful. The only problem I foresee is the increasing number of digits without a delimiter; I'll need to track the previous section id during the conversion. My solution will require my AnyBase PHP class, available on GitHub here: https://github.com/stutteringp0et/AnyBase
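For anyone following along without the AnyBase class handy, here's a minimal stand-in for the decode direction. I'm assuming it treats the string as positional digits in a custom alphabet - see the GitHub link above for the real class:

```php
<?php
// A minimal stand-in for the decode direction of the AnyBase class linked
// above, assuming simple positional digits in a custom alphabet.
class AnyBaseSketch {
    private $alphabet;

    public function __construct(string $alphabet) {
        $this->alphabet = $alphabet;
    }

    public function decode(string $digits): int {
        $base = strlen($this->alphabet);
        $value = 0;
        foreach (str_split($digits) as $d) {
            // shift left by one digit in the custom base, then add the new digit
            $value = $value * $base + strpos($this->alphabet, $d);
        }
        return $value;
    }
}

$a64 = new AnyBaseSketch('0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_$');
echo $a64->decode('12'); // 66 - minus the 65 offset gives section 1
echo $a64->decode('10'); // 64 - minus the 63 offset gives page 1
```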

Update: The most satisfying thing about this kind of work is the feeling of gaining understanding over something unknown. Even when I prove my previous assumptions wrong, it's very satisfying to make progress. I mention this because my previous statement about there being no delimiter in the >4 digit page/section was entirely wrong. Using the now-known base number system, I found that there is indeed a delimiter and the page/section remains 4 digits: when the page exceeds 64, a delimiter of "1" is added between the page and section. Articulate claims this encoding manages the size of the data - but this additional delimiter is completely unnecessary and only adds to the size of the output. A 2-digit section using their version of a 64-character base number system tops out at 4031 pages and sections (each) before the need to increase either to 3 digits.

I've almost finished the translator for the slide array. This has been really fun.

function translateSlides($slides) {
    // requires the AnyBase class (linked above) for the custom base-64 decoding
    $a64 = new AnyBase('0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_$');
    $r = array();
    foreach($slides as $slide) {
        switch(strlen($slide)) {
            case 4:
                // two base-64 digits each: page first, section second
                list($page64,$section64) = str_split($slide,2);
                break;
            default: // anything else: section is the last 2 digits, page is the first 2
                $section64 = substr($slide,-2);
                $page64 = substr($slide,0,2);
                break;
        }
        $r[$slide] = array('section'=>($a64->decode($section64)-65),'page'=>($a64->decode($page64)-63));
    }
    return $r;
}
$translated = translateSlides($slides);
print_r($translated);
Array
(
    [1012] => Array
        (
            [section] => 1
            [page] => 1
        )

    [1112] => Array
        (
            [section] => 1
            [page] => 2
        )

    ... (entries [1212] through [1Z12] continue the same pattern, pages 3 through 62) ...

    [1_12] => Array
        (
            [section] => 1
            [page] => 63
        )

    [1$12] => Array
        (
            [section] => 1
            [page] => 64
        )

    [20112] => Array
        (
            [section] => 1
            [page] => 65
        )

    ... (entries [21112] through [2j112] continue with 5-digit keys, pages 66 through 84) ...

    [2k112] => Array
        (
            [section] => 1
            [page] => 85
        )
)

To wrap it all up in a nice package, here's an abstract class:

 
/*  @copyright Copyright (C) 2013 - 2018 Michael Richey. All rights reserved.
 *  @license GNU General Public License version 3 or later
 */
 
abstract class decodeStateData {
 
 public static function slides($data) {
  $matches = array();
  $items = array();
  $dataregex = '/100(?P<end>[0-9a-zA-Z_$]{4,5}(?=[0-9a-zA-Z_$~]+010))(.*?)(?=010)(?P<data>(.*?)\k<end>)/';
  $itemregex = '/0(?=[0-9a-zA-Z_$])(?P<item>[0-9a-zA-Z_$]{4,}?(?=(0|$)))/';
  preg_match($dataregex, $data, $matches);
  preg_match_all($itemregex, $matches['data'], $items);
  return $items['item'];
 }
 
 public static function translateSlides($slides) {
  require_once('anybase.php');
  $a64 = new AnyBase('0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_$');
  $r = array();
  foreach($slides as $slide) {
   switch(strlen($slide)) {
    case 4:
     list($page64,$section64) = str_split($slide,2);
     break;
    default: // anything else
     $section64 = substr($slide,-2);
     $page64 = substr($slide,0,2);
     break;
   }
   $r[$slide] = array('section'=>($a64->decode($section64)-65),'page'=>($a64->decode($page64)-63));
  }
  return $r;
 }
}
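As a sanity check, running the class's two regular expressions against the sample state line quoted at the top of this post isolates the two slide records it contains:

```php
<?php
// Sanity check: the class's two regexes applied to the sample state line
// quoted earlier in this post, which holds exactly two slide records.
$line = '2M146070ji1001112a0101201112~201r100000000000000000000000000v_player.6RcLlMaxzl8.5yCw4XHlO1J1^1^0000000000000000000';

$dataregex = '/100(?P<end>[0-9a-zA-Z_$]{4,5}(?=[0-9a-zA-Z_$~]+010))(.*?)(?=010)(?P<data>(.*?)\k<end>)/';
$itemregex = '/0(?=[0-9a-zA-Z_$])(?P<item>[0-9a-zA-Z_$]{4,}?(?=(0|$)))/';

preg_match($dataregex, $line, $matches);     // isolate the slide dataset
preg_match_all($itemregex, $matches['data'], $items); // split into records
print_r($items['item']); // 1012 and 1112 - section 1, pages 1 and 2
```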

I hope this helps you get past the overprotective nonsense. This encoding scheme isn't anything particularly awesome, and definitely not something worth protecting as fiercely as it's being protected. People have been asking for a way to read the data for 4+ years and Articulate has continued to deny that request. This set of functions will serve to grant access to this data, and possibly convince Articulate to abandon their death grip on our data and just store it in a standardized format.

My functions are written in PHP, because that's what my LMS is written in - but I could easily convert this to other languages if needed....that won't be free though ;)

Final thoughts:

There are still some encoded segments that I haven't identified yet. Some make me scratch my head and ask "why?" (like an alphabet that seems to mirror the progression of sections...what's the purpose of that?). There are several pieces of data that increment at a regular pace (some by 7 per update, some by 5, some by 3). Still other data seems to change with no regularity at all...or perhaps it's too complex for me to identify with my monkey brain.

Did I mention that I write and host custom software for most of my customers? What can I do for you?

Update 10/4/2018 - I've rewritten the Regular Expressions to accommodate old and new data storage methods. This covers xAPI as well as the older SCORM output from various versions of Storyline.

I noticed that the beginning of the data section contained the last bit of data - a sort of preamble identifying where the end of the data is. Using that information, I was able to construct a new regex that captures the ending data element and looks for it later in the expression.

2M146070ji1001112a0101201112~201r100000000000000000000000000v_player.6RcLlMaxzl8.5yCw4XHlO1J1^1^0000000000000000000
Discuss this article in the forums (0 replies).