Hello, I was wondering if there has been any progress on name-to-ID matching for bills and votes in the v2 API. If there is anything I can do to speed this along, I would be glad to help. I have started on a fuzzy name-to-ID matching function, but I am not sure it is the best way forward, since that depends on how matching is handled on the OS side.
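For illustration, here is a minimal sketch of what a fuzzy matcher along these lines could look like, using only Python's standard-library `difflib`. The function names, the ID-to-name dictionary, and the 0.8 threshold are all assumptions made for the example; none of it reflects how Open States handles matching internally.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase and replace punctuation with spaces, then collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())


def best_match(raw_name, legislators, threshold=0.8):
    """Return the ID whose name is most similar to raw_name, or None.

    legislators: dict mapping ID -> full name (hypothetical shape).
    threshold: minimum similarity ratio to accept (an arbitrary assumption).
    """
    target = normalize(raw_name)
    best_id, best_score = None, 0.0
    for leg_id, full_name in legislators.items():
        score = SequenceMatcher(None, target, normalize(full_name)).ratio()
        if score > best_score:
            best_id, best_score = leg_id, score
    return best_id if best_score >= threshold else None
```

A threshold like this trades false matches against missed ones, so it would need tuning against real vote data before being trusted.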
We’ll be putting out an updated project roadmap soon, and this is likely #2 on it.
(#1 is the completion of the full text search features recently announced.)
The current plan is for us to add aliases to legislators, since an analysis a couple of years back suggested there aren’t that many for most legislators.
This would likely look like a list of unmatched names, a tool to figure out who they might be, and then a PR (hopefully automated) to the openstates/people repo that adds the alias to the matching person. There are a few things to figure out (mainly: if aliases collide, do we have any way of coping with that, or do those names remain unmatched?), but that seems like the path forward.
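To make the collision question concrete, here is a rough sketch of an alias lookup that reports collisions instead of guessing. The record shape (`id`, `name`, `aliases` keys) and the function names are hypothetical, chosen for the example rather than taken from the openstates/people data format.

```python
from collections import defaultdict


def build_alias_index(people):
    """Map each lowercased name or alias to the set of person IDs that use it.

    people: iterable of dicts with hypothetical keys 'id', 'name', 'aliases'.
    """
    index = defaultdict(set)
    for person in people:
        for label in [person["name"], *person.get("aliases", [])]:
            index[label.strip().lower()].add(person["id"])
    return index


def resolve(raw_name, index):
    """Return (person_id, status); status is 'matched', 'collision', or 'unmatched'."""
    ids = index.get(raw_name.strip().lower(), set())
    if len(ids) == 1:
        return next(iter(ids)), "matched"
    if len(ids) > 1:
        # Two or more people share this alias, so leave it for a human to review.
        return None, "collision"
    return None, "unmatched"
```

In this sketch a colliding alias simply stays unresolved, which corresponds to the "remain unmatched" option above.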
We’d be super glad to have help, and I wouldn’t rule out other approaches if you want to experiment. I'm glad to provide assistance where I can.
@james, I did some experimentation on unknown-legislator-names a few years ago; I wish I could remember more of it.
Do you intend for the tool to run within Django, or externally via GraphQL?
This may be obvious, but I found that one could sometimes do surprisingly good work if one first determined the characteristics of roll call vote-sets in a chamber. For example, if one knows the number of seats in a chamber, and can find a vote-set that lists the same number of legislators, then any unknown names in the vote-set must be aliases, replacements, or temporary substitutes for legislators that are “missing” from the vote-set.
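A rough sketch of that seat-count reasoning follows. It assumes a roster that covers every seat in the chamber (no vacancies) and maps each known name to a person ID; the function names and data shapes are made up for illustration.

```python
def unknown_and_missing(vote_names, roster, seats):
    """Split a full-size vote-set into unknown names and missing legislators.

    vote_names: names as they appear in one roll call vote.
    roster: dict mapping known name -> person ID (hypothetical shape),
            assumed to cover every seat in the chamber.
    seats: number of seats in the chamber.
    Returns (unknown_names, missing_ids), or None if the vote-set is not full-size.
    """
    if len(vote_names) != seats:
        return None  # the deduction only holds when every seat is accounted for
    present = {roster[name] for name in vote_names if name in roster}
    unknown = [name for name in vote_names if name not in roster]
    missing = sorted(set(roster.values()) - present)
    return unknown, missing


def infer_alias(vote_names, roster, seats):
    """Return (unknown_name, person_id) when exactly one of each remains, else None."""
    result = unknown_and_missing(vote_names, roster, seats)
    if result is None:
        return None
    unknown, missing = result
    if len(unknown) == 1 and len(missing) == 1:
        return unknown[0], missing[0]
    return None
```

When more than one name is unknown, the same split still narrows the candidates to the missing legislators, even though it cannot pair them up on its own. A pairing found this way only says the name fills that seat; whether it is an alias, a replacement, or a temporary substitute still needs checking.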