Re: [Cdn-DMCA] Proposal: collaborative comments on the Commons debates
From: Russell McOrmond <russell _-at-_ flora.ca>

On Tue, 30 Apr 2002 mskala@ansuz.sooke.bc.ca wrote:

> * My preferred platform for developing stuff like this is PHP4, with
> either MySQL or PostgreSQL. I have both database packages on my system
> at home, but it isn't network-accessible. I can post things for
> debugging purposes on a friend's machine, www.edifyingfellowship.org -
> a "production" site wouldn't be welcome there for traffic reasons, but
> testing/debugging would be fine. That system supports only PostgreSQL,
> not MySQL, so if we wanted to port from there to a MySQL-based system
> elsewhere, then some rewriting and conversion would be necessary.

I can host the application, whether written for MySQL or PostgreSQL.
I currently have only a MySQL server, but that is because I haven't yet
been motivated to install and configure PostgreSQL on my recent computers
(I previously did everything in Postgres95 and earlier ;-)

For authoring the project, my preference is to have it up on
http://savannah.gnu.org/

Some of the stuff I did previously is up on
http://sourceforge.net/projects/campaigntoolz/

> * If this were going to be hosted on a system that already has a user
> account database (for phpSlash or similar) then it might be desirable to
> write code to connect with those user accounts instead of having a
> separate account base for the parliamentary-comment system; then we'd be
> spared from having to deal with account management ourselves.

We can choose the CMS for the site to match. I have phpSlash up on
weblog.flora.{org|ca}, but the only accounts are for posting the
articles. No 'user accounts' exist.

My preference is to start with phpGroupWare as the base, considering it
was designed to be modular to allow for modules such as this.
http://www.phpgroupware.org/

phpGroupWare is also the basis for the Savannah site, and is being
actively developed.

> * Disk space requirements: an issue of Hansard, in English, is about
> 600K. Double that if we include French as well. Double again if we
> store both the "complete" file and the "in 5-minute segments" files; divide
> by two, maybe, to account for compression. We could probably fake one
> of {complete,segmented} by splitting or joining the other, although
> if space is cheap it would be nicer not to have to, because storing
> both would allow better synch with the Government site. My guess is
> that if we didn't cache the actual text, but only stored the "heading"
> information, that would take about half as much space, counting database
> overhead. If we *did* cache the text we'd probably still want to store
> the headings in a database; my bottom line rough estimate is that we'd
> have about 2M of data to store per day of Hansard, plus whatever
> comments people add. That's not a huge amount of disk space but is
> enough to be worth thinking about; I wouldn't want to have to store it
> on ansuz.sooke.bc.ca with my 100M space limit.

This needs some thought. I do not see the utility of indexing or
storing the entire Hansard, just the parts related to our campaigns. If
other campaigns want to use our tool and have us host it, they can ask
for it.

Our tool should have 'access levels'. An "editor level" person should
have the ability to grade parts of Hansard as "1 - Mentions Free Software
... 3 - related to ICT/copyright/etc ... 5 - unrelated", with the
unrelated stuff (the vast majority) just being expired.

> * Such a system would need people to be "operators", to keep an eye on it
> and make sure everything was going smoothly.
> I could foresee
> vandalism/trolling problems; a "lack of critical mass" problem if we
> ever got into a situation where there were no recent comments in the
> system; and all kinds of fun when (as always happens eventually, with
> systems designed to automatically parse other people's
> for-human-consumption postings) the Government Web people changed the
> format of the Parliamentary site.

Some of this

> * What's the copyright on Hansard, and would this violate it in any way?

It's all under Crown Copyright - this is itself something we may want
to form a position on. Our ability to do this should be protected, and
if it is not then we should be fighting this as citizens (separate from
this campaign).

> * To what extent should or could such a project be bilingual?

To the extent our audience is - it would be nice to attract
Francophones, but not if it would entail rejecting a larger number of
Anglophones. I wish we were all multilingual and this didn't matter,
but most Canadians are linguistically challenged.

> * Programming: I think I can write a parser and basic query script, but I
> don't have time and energy to do all of the development for a nice
> idiotproof system with all the features I've talked about. Do we have,
> or can we recruit, other people who would participate in building it?
> I'm actually more concerned about recruiting "operators", because
> that's an activity I hate doing, whereas building new stuff is an
> activity I enjoy and will do as much as I have time for.

We should start small, with an 'intent' of what we want to do. We can
then set priorities for what should get done.

a) Get a database of the ridings set up, with information on current
   MPs and dates of changes of MPs.

b) Allow people (authenticated editors) to make comments about MPs. I
   don't think I have any interest in allowing anonymous people to post
   at all unless the post is approved by some editor.

c) Weblog for Hansard references of interest.

d) Much later: auto-parsing of Hansard, scores, etc...

---
 Russell McOrmond, Internet Consultant: <http://www.flora.ca/>
 See http://weblog.flora.ca/ for announcements, activities, and opinions
 Read the speech on copyright made in 1841 by Thomas Babington Macaulay
 - a must-read for creators - even predicted the consumer reaction to Napster

--
For (un)subscription information, posting guidelines and
links to other related sites please see http://www.flora.org/dmca/