Assignment of UUID #767
Comments
This is a good catch - whilst most APs don't send statements to multiple LRSs, some might and may not realize the particular importance of generating the statement id. In section 2.4.1 we already say We should also add some wording explaining the rationale for using the same id in both LRSs, perhaps also explaining why the id is a UUID at the same time. |
I don't believe that what I described is an edge case nor do I believe that exhorting activity providers to be good citizens will solve it - especially if the xAPI becomes widely used. The problem I believe stems from the current "definition" of the Activity Provider which is not a definition but a casual description which does not identify the key functions of this component in the xAPI ecosystem. I suggest that the definition of this component be changed from:
to something like:
I don't think that the learning record store component should be responsible for generating uuids. That's not to say that the current LRSs out there should not be generating uuids - as many probably contain the statement generation function - but that's a discussion for another thread... |
Why would the AP be sending to the regional and/or national LRS, why wouldn't the local LRS do it, at which point the id has already been assigned? I think the fact that you can say "I see this as problematic." means that so will the implementer on that side, IOW they'll think "hey I have to already have the id to make this match up, how do I achieve that?". Ensuring an AP can talk to multiple LRS would be more difficult. Making the AP assign an id puts burden on the AP which we've always tried to avoid, and having the LRS have that burden seems reasonable, and an id is certainly a requirement. I'm 👎 on the change in general, this is a technical spec, people should read it that way. On the AP description, certainly could change, but I think there are issues with the suggested definition, not the least of which is there are APIs other than the statement ones. An AP is just a system that communicates with an LRS. |
What does IOW mean? On Mon, Oct 12, 2015 at 1:14 PM, Brian J. Miller [email protected]
Roger Swetnam |
In Other Words (IOW) -a-
|
@brianjmiller I'm afraid this is the first technical spec that I've commented on so any suggestions you have on how I should be reading it or am misreading it would be greatly appreciated. Also, I was unaware that you've always tried putting the burden on the AP to generate unique identifiers. What's the rationale for this? If you could help me here, I think I would be in a better position to reply. |
I am concerned that the proposed solution for multiple unique identifiers for the same learning experiences relies on Activity Providers to not only be good citizens but smart citizens. Consider the following use case. xAPI has become very popular and recruitment agencies are asking job applicants to post their CVs to their Learning Record Stores. I've just graduated from Blogs University and have just figured out that I can create a table of my learning experiences and create a little program that will allow me to post my collections of learning statements to multiple recruitment agencies by simply inputting their address and pressing a button. Because I didn't attend the class on unique identifiers, I don't associate uuids with my statements - but no problem - all 50 of them will accept my postings and generate those uuids for me. |
There are probably enough things in the spec that "could cause problems" to overflow a 32-bit unsigned integer. But LRS-assigned UUIDs are a positive boon because:
What @rswetnam is describing is not an AP->LRS problem, it's an LRS->LRS problem; if you need to move statements between LRSs, take the whole statement including UUID as the spec (at least seemingly) mandates. Don't generate a new one just because you're moving data around, that defeats the purpose of having UUIDs in the first place. |
@canweriotnow, this makes a lot of sense to me:
The spec is vague in that regard. At the time this language was first put in, LRS-to-LRS communication wasn’t something yet practiced and (2.4) as it stands could be interpreted both ways. If in practice LRSs aren’t currently respecting the original UUID in LRS-to-LRS communication, I can see this clarification as something I’d like to see in a 2.0 because this could certainly be a breaking change for some. |
Each AP is its own hub of activity in an emergent network of APs. To localize the use of the AP’s generated data, an AP has to have access to the UUIDs for the statements it creates. |
@aaronesilvers So, point by point:
I guess it frustrates me that we're treading in the footsteps of the IETF who wrote RFCs expecting that people would implement TCP or HTTP intelligently and clean up their own messes if they screwed up, and then fall in these morasses which more closely resemble CSS, where the standards designer assumed he was designing for idiots, and we had to invent things like LESS and SASS/SCSS to get around the idiotic idiot-proofing. I'd rather design for the intelligent case. P.S. Not calling anyone here an idiot, I just have been through too many standards processes and seen successes and cock-ups and would like to stay on the success side (which is where I believe we are headed). |
@aaronesilvers Also, consider that not all APs are created equal; I know you're aware, but many of us forget, xAPI is not just another iteration of SCORM; it goes so far beyond trite LMS applications... a thousand beacons or RFID readers could be reporting to a single embedded system (Rasp Pi, Android device, etc.), that is actually capable of constructing statements and making an HTTPS connection to an LRS... or maybe a simpler system that assembles data, sends bytes over TCP to a system capable of HTTPS communication... what, then is the AP? The RFID reader? The intermediary, slightly smarter chip? The system that can actually make an HTTPS POST and get the UUIDs back? We're not building for the same world that a lot of the old SCORM folks are thinking of... and if we target xAPI to that we're shooting ourselves in the face, repeatedly, with a mortar. We're used to a world of pure signal, test scores and the like... I suggest that anyone interested in the future of xAPI read Claude Shannon before proceeding. (Also, I use HTTPS here b/c if you're sending data to an LRS over HTTP, I'm going to MiTM attack that out of sheer principle. It'd be unethical to do otherwise.) |
@rswetnam I'll reply to you directly since you addressed me but I think the others have captured well the thoughts. I think you are reading it fine, I just wanted to caution against adding too much to a technical spec that wasn't purely technical, which is to say testable as well. I think your thoughts and commentary are worthwhile, even just in an issue like this where they can be searched, are discoverable, the history can be captured, etc. I consider that working really well, without having to add non-technical language to describe things to the document itself. To me (read: opinion) it is enough to say APs can generate ids, LRSs must, and that they are UUID, leave the use cases, best practices, and reasons as to why for issues, blog posts, tweets, etc. I think you reversed something, we try to put the burden on the LRS (avoid doing so on the AP). The reason to put the burden on the LRS rather than the AP is because we are hoping there are a lot of them (millions perhaps), but there are very likely to remain few implementations of LRSs (hundred? hundreds? right now I think we are at about 10-15) and probably at least one order of magnitude fewer installations. |
In suggesting that APs be required to include unique identifiers with statements posted to Learning Record Stores, I am making the following assumptions:
I am more than open to the possibility that any of my assumptions are wrong and would be interested to learn which and why. |
I think that is the point. Personally I'd like to see us use the first sentence in the original definition of an AP, because that is the only thing that one can really say about it. It is specifically a system communicating with an LRS. All of the things that aren't using the model and APIs defined in the xAPI spec are not APs and they are talking to systems that are not LRSs. They may well be doing things correctly and efficiently, they just aren't using xAPI even if that data is eventually translated (the key word here) into xAPI. And yes an LRS can itself be an AP as it may be a system communicating with an LRS (either internally to itself or with another LRS). I think @aaronesilvers' comment that "At the time this language was first put in, LRS-to-LRS communication wasn’t something yet practiced" isn't correct. LRS to LRS communication was known to be needed and had been explored at the earliest conceptions of the spec and is the reason for a number of the properties of statements, such as timestamp vs stored, the authority, etc. |
@rswetnam that's why it is a "SHOULD" requirement, it is considered a best practice and they should do so. But making it a "MUST" may well break some use cases and at this point would certainly be backwards incompatible.
But that isn't a problem for the spec, that is a problem for the AP. If it makes that AP less capable then the market will handle that.
May be a problem for the system, that system may also know how to handle that problem wisely. Something being a technical challenge through poor authoring isn't something that the spec has to correct for. I'll agree, I don't see generation of UUIDs as a burden, but I suspect that is also why it is included as a "SHOULD" instead of being left out completely. Having said that, random number generation can be taxing on a system, and as @canweriotnow has pointed out the types of systems likely to be generating statement-like data could be very low powered and may not need to know the UUIDs of their statements, so unnecessarily burdening them to solve a non-problem doesn't seem necessary. As far as an AP sending a statement to multiple LRSs, I disagree that they need independent identifiers, there isn't anything indicating an AP can't send the same statement to multiple LRSs. The LRS is expected to be able to handle receiving the same statement multiple times, how to do so is explicitly called out in the specification, and covers the case where it may come from multiple different systems (such as the originating AP vs. another LRS). |
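For readers following along, here is a minimal sketch of what "handle receiving the same statement multiple times" could look like on the LRS side. It assumes the usual reading of the spec (a repeated id with matching content is accepted quietly, a repeated id with different content is a conflict). `ToyLRS`, `ConflictError`, and the sample statement are hypothetical names for illustration, and plain dictionary equality is a simplification of the spec's statement comparison rules.

```python
import copy


class ConflictError(Exception):
    """Raised when an id is reused for a statement with different content."""


class ToyLRS:
    """Hypothetical in-memory stand-in for an LRS; not a real implementation."""

    def __init__(self):
        self._by_id = {}

    def store(self, statement):
        statement_id = statement["id"]
        existing = self._by_id.get(statement_id)
        if existing is None:
            # First time this id is seen: keep a private copy.
            self._by_id[statement_id] = copy.deepcopy(statement)
            return "stored"
        if existing == statement:
            # Same id, same content: nothing new to store.
            return "duplicate"
        # Same id, different content: refuse (an HTTP LRS would answer 409).
        raise ConflictError(f"id {statement_id} already used for another statement")


statement = {
    "id": "12345678-1234-5678-1234-567812345678",  # made-up example UUID
    "actor": {"mbox": "mailto:learner@example.edu"},
    "verb": {"id": "http://adlnet.gov/expapi/verbs/completed"},
    "object": {"id": "http://example.edu/courses/intro"},
}

lrs = ToyLRS()
first = lrs.store(statement)                   # e.g. arrives from the originating AP
second = lrs.store(copy.deepcopy(statement))   # e.g. forwarded later by another LRS
```

The point of the sketch is only that the second arrival is recognised because the id travelled with the statement; without a shared id the LRS would have nothing cheap to match on.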
@brianjmiller to try and be clear here, it’s not a dig. Yes, we absolutely knew LRS-to-LRS communication was needed and had been explored, but you have to admit there’s a difference between knowing, modeling and prototyping and that communication being realized with multiple LRSs developed by different teams. My point being, yes, of course, conceiving of the need for LRS-to-LRS communication influenced several properties — and still there are nuances that may be recognized only when it’s in the wild and we realize that despite our best efforts, without the aid of conformance requirements spelling out what are seemingly minute details, that implementation may vary. |
|
👍 @brianjmiller and @aaronesilvers |
I accept that there is no appetite for requiring APs to include unique identifiers with their statements. At the risk of beating a dead horse, I'd like to have one last kick at the can for consideration of this position in a future version of the spec. I'm looking to land a job in an LRS factory.
Now sure, these LRS employers can parse the 20 records and see that they describe the same experience - and do something with them. I guess my point is why add this cost and complexity to the overall system when it could be avoided simply by requiring APs to include unique identifiers with their statements - kind of like client-side validation for the xAPI ecosystem. I'm not convinced that this is a significant marginal cost for APs or that it will greatly reduce accessibility to the system. Fortunately most APs are not lazy like me and will follow best practices and include unique identifiers in their xAPI statements. |
The above is only true for your (and related) use case(s). But let's follow your logic: Why not follow better practices for distributed systems and:
Apologies for the snark, but at this point, it really does seem like a "learn the spec and be smart about implementation" sort of problem. And FWIW, I do work in an "LRS Factory" or the formal equivalent, and if you passed the same statement to 50 LRSs from a single AP w/o realizing "oh, hey, this is the use case where I should include a UUID", you'd just have disqualified yourself as a candidate 😸 |
I'm going to try and summarize the key points so we can discuss on a future call. Stuff we agree on:
Possible options: I would like to add a sentence or two explaining why the AP should ideally generate the statement id and why it's especially important if sending the statement to multiple LRSs. I think it's a not-impossible use case that could cause headaches for reporting. @brianjmiller would like to leave the spec as it is. APs should be able to figure it out for themselves. @canweriotnow seems to be (correct me if I'm wrong) arguing for recommending that the LRS generates the statement id and outlined some benefits of that case. |
@garemoko I'm not for disallowing AP id generation, but I'm arguing the case for LRS id generation as allowed, and perhaps preferred, practice. Pretty much in line with @brianjmiller, I think. If I'm against anything, it's designing APs to send to multiple LRSs by default; there should be an intermediary redundancy check, at least. I can envision scenarios in which for instance, tons of sensors have tons of potential "hubs" (for lack of a better word) to which they send data; multiple hubs might have received duplicate data from sensors in range; therefore the logical solution is to have them forward that data to a location actually responsible for crafting the statements and sending them to an LRS. Of course, this is an inverse of @rswetnam's problem; he describes a conventional system broadcasting data to LRSs around the globe. I see two solutions:
Now, I'm not in principle against (1), in fact, it's pretty sexy as distributed data infra goes. But I don't think it's necessary. As for (2), that's easy. IF YOUR AP IS GOING TO PUBLISH TO >1 LRS GENERATE A UUID if not, the spec just works as is. If you don't know in advance which is the case, FIND OUT BEFORE DEPLOYING YOUR FRAKKING AP. I'm happy to discuss this on the next call if necessary, but my gods, of all the issues we need to deal with in this spec, this should be just under "what color do we paint the bikeshed?" |
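The "generate a UUID if you publish to more than one LRS" advice can be sketched roughly as follows. This is an illustration only: `with_statement_id` and `publish` are hypothetical helper names, and the two lists stand in for whatever actually delivers statements to each LRS.

```python
import copy
import uuid


def with_statement_id(statement):
    """Return a copy of the statement, adding a version 4 UUID if it has no id."""
    prepared = copy.deepcopy(statement)
    prepared.setdefault("id", str(uuid.uuid4()))
    return prepared


def publish(statement, senders):
    """Assign the id once, then hand the identical statement to every sender."""
    prepared = with_statement_id(statement)
    for send in senders:
        send(prepared)
    return prepared["id"]


# Two lists stand in for two LRS endpoints (assumption for the sketch).
received_a, received_b = [], []
statement_id = publish(
    {
        "actor": {"mbox": "mailto:learner@example.edu"},
        "verb": {"id": "http://adlnet.gov/expapi/verbs/completed"},
        "object": {"id": "http://example.edu/courses/intro"},
    },
    [received_a.append, received_b.append],
)
```

Because the id is assigned before the fan-out, every receiver sees the same id; an AP that only ever talks to one LRS can skip this and let the LRS assign it.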
FWIW, https://github.com/pubsubhubbub is a pub/sub model and it is referenced (maybe even used?) by the Federal Learning Registry, which was conveniently developed as a predecessor to xAPI. As @canweriotnow asserted, the very notion of this is definitely a 2.0 topic for discussion and possibly consideration. |
Great discussion! @rswetnam this is why new blood is so important to really get these things thought through. I think all of the points have been made, here's where I line up:
|
Thanks @garemoko for the excellent summary here. I see that the ideas I am putting forward involve breaking changes and are more suitable for discussion on the next major semantic version of the spec. Is there an area that is more appropriate where I could put forward this argument as I do not want to muddy discussions about the current minor version? For that venue, I would like to make the following points:
Let me say that I am not advocating for some sort of Jeffersonian democracy of small independent LRSs. I'm just saying that we should consider the possibility that building additional complexity into the system by not requiring generators of learning statements to include UUIDs with them may have consequences that we have not fully thought out. That is perhaps a discussion for another forum. Finally, @canweriotnow Jason Lewis I would request that you tone down the personal attacks against me. While getting the other kids in the playground to shout out bikeshed colours is effective in showing your contempt for me and my tedious arguments, I don't think it adds to the level of discussion in this forum. I would however, be interested in learning more about how small-powered devices which might not be able to generate UUIDs fit into the system that you mentioned earlier in this thread. |
There are some interesting effects caused by the way in which the spec handles identity, equality and immutability. Two statements with different ids could be equal. Given that equality is not defined by two statements having the same ID, it is difficult to see how it would be possible to determine if the same statement existed in two different LRSs. Interestingly I have seen LRPs send the same statement to an LRS multiple times but given that the LRS has no way of testing if the same statement has already been stored, it simply stores it again. Of course letting the learning record provider set the id presupposes that it is capable of generating UUIDs in a way which avoids collisions. In the past, I have seen LRPs which frequently send the same ID. Personally I would like to see a future version of the spec either require the learning record provider to set the ID or prohibit the activity provider from setting the ID. Allowing both the activity provider and LRS to set the ID seems unnecessary. It's interesting to note that cmi5 requires activity providers to set the ID in statements. |
Section 2.4 describes an id as:
It seems to me that this could cause problems. Let's say I am a small community college and I have an activity provider that generates statements but does not assign UUIDs to those statements. Let's say I am responsible for sending my statements of learning experience to the regional LRS and the national LRS. This means that the national and regional LRSs would have different uuids for the same learning statements that I generated. I see this as problematic.
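A small sketch of the scenario described above, assuming each LRS fills in a missing id with its own freshly generated UUID. `ToyLRS` is a hypothetical in-memory stand-in, not a real LRS, and the statement contents are made up.

```python
import uuid


class ToyLRS:
    """Hypothetical in-memory LRS that assigns an id when the AP omits one."""

    def __init__(self):
        self.statements = {}

    def store(self, statement):
        stored = dict(statement)
        # The LRS generates an id only when the incoming statement has none.
        stored.setdefault("id", str(uuid.uuid4()))
        self.statements[stored["id"]] = stored
        return stored["id"]


# The activity provider builds one statement and does not assign an id.
statement = {
    "actor": {"mbox": "mailto:learner@example.edu"},
    "verb": {"id": "http://adlnet.gov/expapi/verbs/completed"},
    "object": {"id": "http://example.edu/courses/intro"},
}

regional, national = ToyLRS(), ToyLRS()
regional_id = regional.store(statement)
national_id = national.store(statement)

# Each LRS generated its own UUID, so the two stored copies no longer share an id.
ids_match = regional_id == national_id
```

In this setup `ids_match` comes out false for all practical purposes, which is the mismatch the question is about.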