classact v2.24.8
Classification Apps
:toc:
Introduction
This repository contains the code for applications to classify, compare and blueline enrolled bills. The apps are separate routes of a single-page, browser-based application. The application is built with an HTML/CSS/JS AngularJS front end, a REST-based MongoDB data layer, and a NodeJS service. Each of these is described in the sections below.
Quick Start
- Clone `Xcential/LRC-Classification` from GitHub.
- Run `./setup.sh` to check dependencies.
- Install any missing dependencies reported by `setup.sh`.
- For local development it is sufficient to run MongoDB (>=4.0) with just a path to the data folder, e.g. `mongod --dbpath=/path/to/mongo/data/dir`. For more information, see MongoDB's resources:
  - https://www.mongodb.com/download-center[MongoDB Download]
  - https://docs.mongodb.com/manual/tutorial/install-mongodb-on-windows/#configure-a-windows-service-for-mongodb-community-edition[Instructions to run as a service on Windows]
  - https://docs.mongodb.com/manual/tutorial/install-mongodb-on-windows/#start-mdb-edition-as-a-windows-service[Start as a service on Windows, >=v4.0] (see link:StartMongoDBService.png[Windows Services list screenshot])
- Set up RESTHEART >=3.6.2:
  - https://github.com/SoftInstigate/RESTHeart/releases[Download] and install http://restheart.org[RESTHEART].
  - We expect the port to be `8007`. You can run RESTHEART using our provided dev config file with: `./bin/restheart.sh /path/to/restheart.jar`
- Set up sample data:
  - Sample data is stored as MongoDB dumps in S3. `./setup.sh` will warn you if your AWS CLI is set up incorrectly or you don't have the correct permissions.
  - Import sample data with `./bin/restore-classact-mongo.sh`
- Copy `app_flask/localsettings.sample.py` to `app_flask/localsettings.py`
Quick Start AWS CENTOS 7.6
- Set up a new CENTOS 7.6 instance from https://aws.amazon.com/marketplace/pp/B07MXGDZG4 (AWS `t3a.large` EC2)
- SSH in to the instance as `ec2-user` (a sudo user)
- Install subscription-manager and register the subscription (requires the username and password for a Red Hat developer subscription)
$ sudo su
$ subscription-manager register
- Attach pool to subscription manager
$ subscription-manager list --available
$ subscription-manager attach --pool={{POOL ID}}
- Add repos and update packages (under `sudo su`):
$ subscription-manager repos --enable rhel-7-server-extras-rpms
$ subscription-manager repos --enable rhel-7-server-optional-rpms
$ rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
$ yum-config-manager --enable rhel-server-rhscl-7-rpms
$ yum-config-manager --enable rhel-server-rhscl-beta-7-rpms
Update instance and install git:
$ sudo yum update
$ sudo yum install -y git wget curl
- Install Mongodb (see https://linuxize.com/post/how-to-install-mongodb-on-centos-7/)
Create a file `/etc/yum.repos.d/mongodb-org.repo` with the following contents:
[mongodb-org-4.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/4.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-4.0.asc
Run `sudo yum install -y mongodb-org`
Start the MongoDB service
$ sudo service mongod start
Redirecting to /bin/systemctl start mongod.service
Add `Restart=always` to the systemd settings for mongod:
- Edit the mongod service file: `sudo vi /lib/systemd/system/mongod.service`
- Add `Restart=always` under `[Service]`
- Reload the systemctl daemon: `sudo systemctl daemon-reload`
Install NGINX (see https://www.cyberciti.biz/faq/how-to-install-and-use-nginx-on-centos-7-rhel-7/)
Create a file `/etc/yum.repos.d/nginx.repo` with the following contents:
[nginx]
name=nginx repo
baseurl=http://nginx.org/packages/mainline/centos/7/$basearch/
gpgcheck=0
enabled=1
Run `sudo yum install -y nginx`
The `nginx` user may already be set up. If it is not, create one.
Install RESTHEART v>=4.0
$ wget -qO- https://github.com/SoftInstigate/restheart/releases/download/4.1.8/restheart-4.1.8.tar.gz | tar xvz
- Run RESTHEART >=4.0 with v3 settings (after MongoDB is running)
Make the following changes to `etc/restheart.yml`:
https-listener: true
https-host: 0.0.0.0
https-port: 4443
http-listener: true
http-host: 0.0.0.0
http-port: 8007
ajp-listener: false
ajp-host: 0.0.0.0
ajp-port: 8009
and
mongo-mounts:
  - what: /uscongress
    where: /uscongress
  - what: /classification
    where: /classification
See https://github.com/softInstigate/restheart#summary
$ cd restheart-4.1.8
$ java -jar restheart.jar etc/restheart.yml -e etc/bwcv3.properties
- Run RESTHEART Security (after mongodb is running)
$ cd restheart-security-1.3.2
$ java -jar restheart-security.jar etc/restheart-security.yml -e etc/default.properties
- Install Python3 and pyenv virtualenv
Install Python (3.6) and pip
$ sudo yum -y install rh-python36
$ scl enable rh-python36 bash
$ sudo env "PATH=$PATH" pip3 install --upgrade pip
- Install Pyenv
$ curl -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer | bash
$ echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
$ echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
$ echo 'eval "$(pyenv init -)"' >> ~/.bashrc
- Restart the shell:
$ exec "$SHELL"
- Add the pyenv-virtualenv plugin
$ git clone https://github.com/pyenv/pyenv-virtualenv.git $(pyenv root)/plugins/pyenv-virtualenv
- Add pyenv virtualenv-init to the shell to enable auto-activation of virtualenvs.
$ echo 'eval "$(pyenv virtualenv-init -)"' >> ~/.bash_profile
- Restart the shell again:
$ exec "$SHELL"
- Install Python 3.7-dev virtualenv:
$ sudo yum install -y zlib-devel bzip2 bzip2-devel readline-devel sqlite sqlite-devel \
openssl-devel xz xz-devel libffi-devel
$ sudo yum groupinstall "Development Tools"
$ pyenv install 3.7-dev
$ pyenv virtualenv 3.7-dev v37dev
$ pyenv activate v37dev
- Install AWS CLI
$ pip install awscli
- Configure AWS CLI (use credentials that have access to the MongoDB and bill data S3 stores)
$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [None]: json
Install Java >=11
$ sudo yum install java-11-openjdk-devel
- Select this version of Java and set `$JAVA_HOME` in `.bash_profile` (as described here: https://www.liquidweb.com/kb/install-java-8-on-centos-7/)
Install NodeJS and NPM
Install Node, NPM and the 'n' version manager (NodeJS >= 12).
$ sudo yum install -y gcc-c++ make
$ curl -L https://git.io/n-install | bash
- Create
/var/app
and clone this repository there
$ mkdir /var/app
$ cd /var/app
$ git clone https://github.com/Xcential-Corporation/LRC-Classification.git
- Install NodeJS libraries and dependencies in LRC-Classification:
$ cd /var/app/LRC-Classification
$ npm install
NOTE: it is necessary to register with the github npm package manager. It may be necessary to remove package-lock.json before installing.
- Install Python requirements:
$ cd /var/app/LRC-Classification
$ pyenv activate v37dev
$ pip install -r requirements.txt
$ cd /var/app/BillProcess
$ pip install -e ./
$ cd /var/app/bill-importer-govinfo
$ pip install -e ./
- Install the `pm2` service monitor globally: `npm install -g pm2` (this will be used for the Node services: `pm2 start bin/services.js`)
- Set environment variables. Add to `~/.bash_profile`:
# where House-Amendment-Parse, services and other directories go
export MAIN_ROOT_PATH='/var/app'
# where the bill data is stored
export BILLDATA_ROOT_PATH='/public/data'
Then
$ source ~/.bash_profile
- Make this available to all users (for the crontab scraper). Create a file `/etc/profile.d/classactvars.sh` with the following content:
export MAIN_ROOT_PATH='/var/app'
export BILLDATA_ROOT_PATH='/public/data'
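Python scripts that depend on these variables can read them defensively. The following is a minimal sketch, assuming a hypothetical helper named `get_classact_paths` (it is not part of this repository) that falls back to the values shown above when a variable is unset:

```python
import os
from pathlib import Path

# Defaults mirror the values exported in the profile files above.
DEFAULTS = {
    "MAIN_ROOT_PATH": "/var/app",
    "BILLDATA_ROOT_PATH": "/public/data",
}


def get_classact_paths(environ=None):
    """Return both root paths as Path objects (hypothetical helper).

    `environ` defaults to os.environ; passing a dict makes testing easier.
    """
    env = os.environ if environ is None else environ
    # Treat an empty string the same as an unset variable.
    return {name: Path(env.get(name) or default) for name, default in DEFAULTS.items()}
```

For example, `get_classact_paths()["BILLDATA_ROOT_PATH"]` returns the bill data root for the current process.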
- Install Github repos in
/var/app
:
$ cd /var/app
$ git clone https://github.com/Xcential-Corporation/BillProcess.git
$ git clone https://github.com/Xcential-Corporation/bill-importer-govinfo.git
$ git clone https://github.com/Xcential-Corporation/House-Amendment-Parse.git
- With the pyenv `v37dev` activated, go into the `BillProcess` and `bill-importer-govinfo` directories above and install the modules:
$ cd BillProcess
$ pip install -e ./
$ cd ../bill-importer-govinfo
$ pip install -e ./
- Install cmake. See https://gist.github.com/1duo/38af1abd68a2c7fe5087532ab968574e
- Install and build the House-Amendment-Parse tools
$ cd /var/app/House-Amendment-Parse
$ ./make_all.sh
NOTE: If you get an error that `cmake3` is not found, change 'cmake3' to 'cmake' in the `make_all.sh` script and run it again.
- Create a `/public/data` directory. Mount a large (>150G) drive to `/public`.
- Copy data from `s3://us-bills/data` to `/public/data`. Set ownership of `/public/data` to the local user, then:
$ aws s3 cp s3://us-bills/data /public/data --recursive
(This will take ~1 hour)
NOTE: For a smaller data set (e.g. for testing purposes), copy only the following files:
$ aws s3 cp s3://olrc-classact-backup-bills/data/117 /public/data/117 --recursive
This creates data within `/public` with the following structure:
- data
-- 117
--- dtd
--- pdf
--- uslm
-- 116
--- dtd
--- pdf
--- uslm
-- 115
--- dtd
--- pdf
--- uslm
-- 114
--- dtd
--- pdf
--- uslm
...
-- 99
--- dtd
--- pdf
--- uslm
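After the copy finishes, the layout can be checked with a short script. This is only a sketch: the function name `missing_bill_dirs` is hypothetical, and it assumes that every Congress directory should contain exactly the three subdirectories shown in the tree above.

```python
from pathlib import Path

# Subdirectories expected under each Congress directory, per the tree above.
EXPECTED_SUBDIRS = ("dtd", "pdf", "uslm")


def missing_bill_dirs(data_root, congresses):
    """Return the expected directories (relative to data_root) that are absent."""
    root = Path(data_root)
    missing = []
    for congress in congresses:
        for sub in EXPECTED_SUBDIRS:
            path = root / str(congress) / sub
            if not path.is_dir():
                missing.append(str(path.relative_to(root)))
    return missing
```

Calling `missing_bill_dirs("/public/data", range(99, 118))` returns an empty list when the full data set is in place; for the smaller test data set, pass `[117]` instead.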
- Copy the ClassAct backup MongoDB dumps (`s3://olrc-classact-backup/class-act-mongo-dump-[collection]-[date].gz`) to `/public/mongo-dump`. Set ownership of `/public/data` to the local user, then:
$ mkdir /public/mongo-dump # If this directory does not already exist
$ aws s3 cp s3://olrc-classact-backup/class-act-mongo-dump-uscongress-2021-02-17.gz /public/mongo-dump
$ aws s3 cp s3://olrc-classact-backup/class-act-mongo-dump-classification-2021-02-17.gz /public/mongo-dump
$ mongorestore --gzip --archive=/public/mongo-dump/class-act-mongo-dump-uscongress-2021-02-17.gz
$ mongorestore --gzip --archive=/public/mongo-dump/class-act-mongo-dump-classification-2021-02-17.gz
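When restoring a different backup date, the archive names can be generated rather than typed by hand. The sketch below assumes the naming pattern seen in the commands above (`class-act-mongo-dump-[collection]-[date].gz`); the helper names are hypothetical, and the function only builds the argument list without running anything.

```python
from pathlib import PurePosixPath


def dump_archive_name(collection, date):
    """File name of a dump archive, following the pattern used above."""
    return f"class-act-mongo-dump-{collection}-{date}.gz"


def mongorestore_args(dump_dir, collection, date):
    """Build the mongorestore argument list for one archive (not executed here)."""
    archive = PurePosixPath(dump_dir) / dump_archive_name(collection, date)
    return ["mongorestore", "--gzip", f"--archive={archive}"]
```

The resulting list can be passed to `subprocess.run(..., check=True)` once for each of the `uscongress` and `classification` archives.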
- Set the server to allow proxying to ports (see https://stackoverflow.com/a/25277699/628748):
$ sudo /usr/sbin/setsebool httpd_can_network_connect -P true
$ chcon -Rt httpd_sys_content_t /var/app/LRC-Classification/app_static
$ chcon -Rt httpd_sys_content_t /public/data
This may also be required, if the form above does not work:
$ sudo chcon -v -R --type=httpd_sys_content_t /var/app/LRC-Classification/app_static
$ sudo chcon -v -R --type=httpd_sys_content_t /public/data
Set up SSL (for https)
https://certbot.eff.org/lets-encrypt/centosrhel7-nginx.html
Serving with Nginx
The following server configuration for Nginx listens on port 443. It also creates a proxy to RESTHEART, which itself provides an API to mongodb.
If the certbot above is not set up or working, set the server to listen on `80`, and remove the server redirect from `80` to `443`.
At one point, after setting up the certificate and re-running Nginx, you may get an error "Unregistered Authentication Agent". I found that deleting `/tmp/mongodb-40017.sock` helped (see https://stackoverflow.com/a/53837662/628748). SELinux settings may also need to be updated to permit proxying from 443.
server {
listen 80 default_server;
listen [::]:80 default_server;
server_name _ class.olrcdev.xcentialcorp.com class.olrc.xcentialcorp.com;
return 301 https://$host$request_uri;
}
server {
listen 443 default_server;
listen [::]:443 default_server;
server_name class.olrcdev.xcentialcorp.com class.olrc.xcentialcorp.com;
client_max_body_size 4G;
# Load configuration files for the default server block.
include /etc/nginx/default.d/*.conf;
location / {
root /var/app/LRC-Classification/app_static/;
add_header REMOTE_USER $remote_user;
}
location ^~ /data/ {
alias /public/data/;
}
location = /mongodb {
return 302 /mongodb/;
}
location /py/billToBill {
rewrite ^/py/billToBill /billToBill break;
proxy_pass http://127.0.0.1:5000/billToBill;
proxy_redirect off;
}
location /py/billUpload {
rewrite ^/py/billUpload /billUpload break;
proxy_pass http://127.0.0.1:5000/billUpload;
proxy_redirect off;
}
location /getBillUploadResult {
proxy_pass http://127.0.0.1:5000/getBillUploadResult;
proxy_redirect off;
}
location /services/ {
rewrite ^/services/(.*) /$1 break;
proxy_pass http://127.0.0.1:2700/services$uri$is_args$args;
proxy_redirect off;
}
location /mongodb/ {
rewrite ^/mongodb/(.*) /$1 break;
proxy_pass http://127.0.0.1:8007$uri$is_args$args;
proxy_redirect off;
}
# redirect server error pages to the static page /40x.html
#
error_page 404 /404.html;
location = /40x.html {
}
# redirect server error pages to the static page /50x.html
#
error_page 500 502 503 504 /50x.html;
location = /50x.html {
}
}
Start Nginx
$ sudo systemctl enable nginx
$ sudo systemctl start nginx
Serving on IIS
Serving the app on IIS requires (a) the static UI (`app_static`), (b) the aspx whoami service (from the `LRC-Phase-3/services` repository), (c) services for creation of output XML (`app_flask` for now, moving to NodeJS services), and (d) a reverse proxy for the RESTHEART service. To set this up:
- Create a new web site. Name it 'LRCApps', bind it to your domain (e.g. `class.linkedlegislation.com`) and define the path to `app_static`. This site will be served statically.
- Create an application to serve the `whoami.aspx` file from `LRC-Phase-3/services`.
- Check `/services/whoami` to ensure that it is working.
NOTE: The IIS configuration can be frustrating since configuration settings are split between files. While most of the configuration is in the `web.config` files of `app_static` and `app_flask`, some configuration must be done through the IIS Manager and ends up in `applicationHost.config` within `C:\Windows\System32\inetsrv\config`.
The rewrite module, which is used for the reverse proxy, is described in this http://stackoverflow.com/a/6741094/628748[StackOverflow answer], which provides some examples; https://www.iis.net/learn/extensions/url-rewrite-module/url-rewrite-module-configuration-reference[this page] is a good general reference on IIS rewrite configuration. More about setting up a reverse proxy is available https://weblogs.asp.net/owscott/creating-a-reverse-proxy-with-url-rewrite-for-iis[here].
NOTE: Add the following settings to restheart.yml, which seems to have fixed a problem where the first query to mongodb/restheart would be returned with a 502 error from the proxy:
local-cache-enabled: true
# TTL in milliseconds; specify a value < 0 to never expire cached entries
local-cache-ttl: -1
# Limit for the maximum number of concurrent requests being served
requests-limit: 1000
# The idle timeout in milliseconds after which the channel will be closed.
# If the underlying channel already has a read or write timeout set
# the smaller of the two values will be used for read/write timeouts.
# Defaults to unlimited (-1).
IDLE_TIMEOUT: -1
# The maximum allowed time of reading HTTP request in milliseconds.
# -1 or missing value disables this functionality.
REQUEST_PARSE_TIMEOUT: -1
Build
Build Classification Apps. Build options:
`$ npm run build`:: General build. Updates bower libraries, if needed; combines and minifies css and libraries; copies files to their production location; applies template patterns to html templates. A build should be done whenever any html files are changed in `/src`.
`$ grunt docs`:: Performs a general build as above. In addition, produces a Changelog from GitHub tags and closed issues and converts the Changelog and other documentation to pdf and html.
TODO: Convert `grunt docs` to npm
Development
Front End
JS: Libraries
The application is built on the https://angularjs.org/[AngularJS Framework v1.x]. JavaScript library dependencies are managed with https://bower.io/[bower] and listed in `static/bower.json`. These include the following Angular-related libraries:
[width="100%",options="header"]
|====================
| Library | Description | Configuration | Usage
| https://ui-router.github.io/[ui-router] | url routes and parameters | Configure the initial route state, `$stateProvider.state`, in `static/js/config.js`. Define additional state transition functions as callbacks of `$scope.$watch('$state.current.name'...` in `static/app.js`. | State and parameters are defined in the browser url (e.g. `compare.html?billnumber=115-1`), or in javascript through the `$state.go()` function. For example, transition to the compare route, with defined parameters: `$state.go('compare',{'billnumber':'115-1', 'currentUser':'comparefinal'}, {notify: false})`
| http://ui-grid.info/[ui-grid] | Feature-rich datagrid library | Define column settings for each route in `static/js/uiGridConfigFactory.js` | Set the `UIData` object, after loading data from the REST query, in `static/js/makeUIDataFactory`. The UIData object should match the structure expected by the grid configured in `uiGridConfigFactory.js`.
| https://angular-ui.github.io/bootstrap/[angular-bootstrap] | UI styling library, based entirely on angularjs (+ bootstrap.css) | N.A. | Use angular-bootstrap components by setting html elements and attributes in `/src`
| http://chieffancypants.github.io/angular-hotkeys/[angular-hotkeys] | Add keyboard shortcuts and a shortcut list | Define hotkeys in `static/js/sethotkeys.js`. Many of these refer to functions in `static/js/app.js` | Display hotkeys with `?`
| https://github.com/aih/angular-boxy.git[angular-boxy] | Small angular directive for layout and resizable divs | Define directive attributes and classes in `src/rootView.html` | Use `bx-resize`, `bx-resize-border` and `bx-split-with` attributes, as defined in the https://github.com/aih/angular-boxy.git[angular-boxy] repository.
| https://github.com/aih/angular-xreader.git[angular-xreader] | Angular directive for a read-only xml div | Add directive attributes to html (e.g. in `src/panels/billPanel.html`) | Use `xr-reader`, `xmlsource` and `xrpre` attributes, as defined in the https://github.com/aih/angular-xreader.git[angular-xreader] repository. When the variable defined by `xmlsource` is set to XML text, that text is displayed in the div with stylesheets set by the `xrstylesheets` attribute.
|====================
JS: Routes and Grid
HTML: Layout
The main view file for the apps is `/app_static/views/rootView.html`. In addition to the root elements and resource links, this file includes the basic layout for all of the apps. The layout uses the open-source `angular-boxy` library, developed by Xcential. The library includes CSS definitions for containers with absolute positioning, and directives for resizing and split-container resizing.
The rootView.html file defines an app with the following layout:
[width="100%", options="header"]
|====================
2+^| Navbar
<| East Panel >| West Panel
2+^| South Panel
|====================
Each of the panels can be collapsed, and can be resized by adding a `bx-resize` attribute on the element (described in more detail in the angular-boxy repository). In the Classify, Compare and Blueline apps, the 'South Panel' is collapsed entirely.
HTML: Templates and Routing
The app html is built from modular templates. The final templates are stored in the `/app_static/views` directory. To override a template, development should be done in `/app_static/apps`, for example `/app_static/apps/classify/classify.html` or `app_static/apps/compare/views/modals/finalCommentsModal_compare.html`. The templates from `/app_static/apps` are copied during the GruntJS build to the `/app_static/views` directory, from which they are served by Flask.
Within rootView.html are elements with `ui-view` attributes. The contents of those elements are determined by routes in ui-router, as defined in `app_static/apps/[app]/js/config.js`. Thus, for example, the basic layout of the Classify app, defined in `app_static/apps/classify/js/config.js`, has a grid in the West Panel and a bill viewer in the East Panel. A few substitutions are made in the Navbar, using the `views` definitions in `app_static/apps/classify/js/config.js`.
The state of UI components can be determined by state parameters (`$scope.$stateParams`). For example, the collapsed/uncollapsed state of the bill view panel is toggled by the `billCollapsed` parameter, which can be set as a url query parameter: `[baseurl]/compare.html?billCollapsed=1` will initialize the app with the bill panel closed (not yet implemented).
In general, if the presence or absence of a UI feature is constant in an app, it is set through the ui-view routes. If the state of a feature is toggled within an app, its state is set in `$stateParams`.
Services
The application back end consists of an express-based NodeJS service, defined in `service.js`. It calls functions in `services-js`.
There is also a `flask` app, `flask_app.py`, for functions like copying data from one bill to another. These services are defined in `app_py`.
Annotated bill (`/services/annotate/:billname?user=[username]`)
For each bill/user combination, an 'annotated bill xml' can be created. It consists of the original bill XML, enriched with classification data and bill metadata in notes (e.g. `sidenote`, `footnote`, etc.), and styled with CSS for the annotated bill. For example:
http://class.olrcdev.xcential.com/services/annotate/116hjres31enr.xml?user=comparefinal
To retrieve just the JSON, consisting of the bill classification data combined with metadata (stat. page, P.L. number), if available, add the `json` parameter with a value, e.g.:
http://class.olrcdev.xcential.com/services/annotate/116hjres31enr?user=comparefinal&json=true
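The two URL forms above can be assembled programmatically. The helper below is a sketch only: `annotate_url` is a hypothetical name, and the choice of `json=true` simply follows the example (the text says any value enables JSON output).

```python
from urllib.parse import quote, urlencode


def annotate_url(base_url, billname, user, as_json=False):
    """Build an annotated-bill URL following the two examples above."""
    query = {"user": user}
    if as_json:
        # Any value switches the service to JSON output; "true" matches the example.
        query["json"] = "true"
    return f"{base_url.rstrip('/')}/services/annotate/{quote(billname)}?{urlencode(query)}"
```

For instance, `annotate_url("http://class.olrcdev.xcential.com", "116hjres31enr.xml", "comparefinal")` reproduces the first example URL.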
TODO: describe services and other routes
Data
NOTE: The Statutes at Large pages for each Public Law are linked from this page: https://www.archives.gov/federal-register/laws/current.html
This data is not currently scraped by ClassAct; rather, the page numbers can be entered manually by users at `/dash.html`.
Bill Files
The bills themselves, converted to USLM and transformed with UI elements, are downloaded and processed with the applications in the `BillProcess` repository.
MongoDB
The list of all bills, and the data for each bill, are stored as json in https://www.mongodb.com/[MongoDB], a document database. MongoDB is accessed in Python from `mongofunctions.py`, which uses the `pymongo` library.
RESTHEART API
The data is accessed from the UI through REST queries to RESTHEART (see https://softinstigate.atlassian.net/wiki/display/RH/Resource+URI[query documentation]), a Java-based interface. For example:
List of bills for Congress 115::
`[baseurl]/classification/bill_info?filter={'congressnumber':'115'}`
All classifications for HR39enr (P.L. 115-1)::
`[baseurl]/classification/bills?filter={'congressnumber':'115','billnumber':'hr39','data':{'$ne':[]}}`

On IIS, the RESTHEART API is accessed through a reverse proxy at the url `[baseurl]/mongodb/`, so the bills collection is at `[baseurl]/mongodb/classification/bills`.
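Outside the browser, the same queries can be built with a properly encoded `filter` parameter. The following is a minimal sketch using only the standard library; `restheart_url` is a hypothetical helper, and the base URL is whatever fronts RESTHEART in a given deployment (for example the `/mongodb` proxy path). It serializes the filter as strict JSON with double quotes and URL-encodes it.

```python
import json
from urllib.parse import urlencode


def restheart_url(base_url, collection, filter_doc, database="classification"):
    """Build a RESTHEART collection URL with a URL-encoded JSON filter."""
    # Compact separators keep the encoded query string short.
    query = urlencode({"filter": json.dumps(filter_doc, separators=(",", ":"))})
    return f"{base_url.rstrip('/')}/{database}/{collection}?{query}"


# The two example queries from the text, expressed as Python dicts.
BILLS_FOR_115 = {"congressnumber": "115"}
CLASSIFICATIONS_FOR_HR39 = {
    "congressnumber": "115",
    "billnumber": "hr39",
    "data": {"$ne": []},
}
```

`restheart_url("http://localhost:8007", "bill_info", BILLS_FOR_115)` would target a local RESTHEART on the port used in this guide.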
Annotated bills
Bill XML, annotated with classification data, can be retrieved through the application's NodeJS service.
Users
Each 'document' in the `bills` collection (see below) is uniquely defined by a combination of `billnumber` (e.g. `116hjres31enr`) and `user` (e.g. `lskouras`). The structure of the document is described in more detail below; in essence, it contains the classifications and other data for each provision of the bill.
For each bill, there are three initial users defined: `auto`, `comparefinal` and `bluelinefinal`. The first of these serves as a template for the others. It contains all of the provisions of the bill, and an automatically generated classification for any provision that has an 'action phrase' as part of an amendment (e.g. 'by striking "this" and inserting "that"'). When an OLRC user (e.g. `lskouras`) logs into the application and imports a bill for the first time, a copy of the `auto` document is made with the bill+user combination (e.g. `{'billnumber': '116hjres31enr', 'user':'lskouras'...}`). The documents for the `comparefinal` and `bluelinefinal` users have the same structure as all of the others, and contain the classifications added by a reviewer at those stages of bill processing.
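The copy step can be pictured as follows. This is an illustrative sketch, not the application's actual import code: the function name is hypothetical, and it assumes that only the `user` field changes and that MongoDB assigns a new `_id` on insert.

```python
import copy


def make_user_document(auto_doc, user):
    """Copy the `auto` template document for a new bill+user combination."""
    if auto_doc.get("user") != "auto":
        raise ValueError("expected the template document for user 'auto'")
    # Deep copy so later edits to the user's data never touch the template.
    doc = copy.deepcopy(auto_doc)
    doc.pop("_id", None)  # assumption: the database assigns a fresh _id
    doc["user"] = user
    return doc
```

The deep copy matters because `data` is a nested object; a shallow copy would share provision entries between the template and the user's document.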
Data Structures
The MongoDB database for these apps is called `classification` and contains two MongoDB collections:
bill_info::
The `bill_info` collection stores one document for each bill ('hr39enr' and 'hr39eh' are stored as separate documents). The document has the following structure:
[source, javascript]
{
  _id: {$oid: "58ac24260f13750af4278595"},
  latestaction: {$date: 1484870400000},
  billnumber: "hr39",
  pagelength: "3",
  congressnumber: "115",
  publaw: "115-1",
  enrolled_filename: "BILLS-115hr39enr.xml",
  enactdate: "01/20/2017",
  enrolled_url: "https://www.congress.gov/115/bills/hr39/BILLS-115hr39enr.xml"
}
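Note that the document stores dates in two forms: `latestaction` is an extended-JSON `$date` (milliseconds since the Unix epoch) and `enactdate` is a string. A small sketch of converting both, with hypothetical helper names and assuming `enactdate` is always MM/DD/YYYY as in the sample:

```python
from datetime import datetime, timezone


def mongo_date_to_datetime(value):
    """Convert an extended-JSON {"$date": <ms since epoch>} value to UTC."""
    return datetime.fromtimestamp(value["$date"] / 1000, tz=timezone.utc)


def parse_enactdate(text):
    """Parse the MM/DD/YYYY string stored in `enactdate` (assumed format)."""
    return datetime.strptime(text, "%m/%d/%Y").date()
```

For the sample document, both fields resolve to the same calendar day, 20 January 2017.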
bills::
The `bills` collection stores one document for each user-bill combination. For each bill, three documents are automatically created in the LRC-Phase-3 project from importer.py: one for auto-classifications, one for the 'comparefinal' user and another for the 'bluelinefinal' user. The documents have `{'user': 'auto'}`, `{'user': 'comparefinal'}` and `{'user': 'bluelinefinal'}` respectively. Classifications created by each attorney are stored in a document with their user name (e.g. 'jwagner'). When classifications and bluelines are finalized in the UI, they are stored in separate documents, with usernames defined in the `AUTHORS` object of `static/js/config.js`.
The `AUTHORS` constants are defined as follows:
[source, javascript]
'AUTHORS':{
  'CLASS_AUTO_AUTHOR':'auto',
  'COMPARE_FINAL_AUTHOR':'comparefinal',
  'BLUELINE_FINAL_AUTHOR':'bluelinefinal',
  'DEV_USER':'olrcdev'
}
A sample bills collection document can be retrieved from the REST API at `[baseurl]/mongodb/classification/bills?filter={"user":"lskouras", "billnumber":"116hjres31enr"}`. The json includes:
a. metadata, which is also reflected in the `bill_info` document for the bill.
b. `user`, which is either an automatically created user (e.g. `auto`), or an OLRC user with the data they have entered.
c. `datakeys`, which is an ordered list of the ids for bill provisions.
d. `data`, which is a python dict (json object) with provision ids as the key and data about the provision as the value. The `datakeys` list above maintains order, since the order of keys cannot be guaranteed, either in the dict or json object. The data object includes information about the hierarchical level of the provision (`treeLevel`), display of the provision in the bill document (`showBlueline`), as well as information gathered from parsing the text for classification. The 'targetProposedSection' property contains the classification of the provision, if any. Additional properties store data about, for example, merging of provisions or bluelining.
e. `ranges`, which contains all of the unenumerated and merged provision ranges. This allows the UI to restore these ranges when the bill is loaded (e.g. as highlighted text).
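Because `datakeys` carries the ordering, code that walks a document should iterate over `datakeys` and look each id up in `data`, rather than iterating over `data` directly. A minimal sketch, with a hypothetical function name and the assumption that ids missing from `data` should simply be skipped:

```python
def ordered_provisions(doc):
    """Yield (provision_id, provision) pairs in `datakeys` order."""
    data = doc.get("data", {})
    for key in doc.get("datakeys", []):
        if key in data:  # assumption: silently skip ids with no data entry
            yield key, data[key]
```

For example, `[p.get("treeLevel") for _, p in ordered_provisions(doc)]` lists the hierarchy levels in bill order.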
There are three different kinds of provisions stored in `data`. All provisions of the original bill are included with their id as the key. In addition, there are two kinds of user-created provisions: `unenumerated` and `merged`.
Unenumerated provisions identify a contiguous range of text within the bill. These may overlap hierarchical provisions of the bill. They have a unique id created by combining a container id + '-u-' + a hash that consists of the first and last words of the phrase, the ordinal position of these words within the container, and the length of the phrase. E.g. `H8F003FE3EB5543B3850460B8C38DB194-u-That_13_improvements_1_len_322` is found within the container with id `H8F003FE3EB5543B3850460B8C38DB194`, starting with the 13th 'That' and ending with the first 'improvements', having a total length of 322 characters. Unenumerated provisions are uniquely defined by the `text_offset`, which is the character offset from the beginning of the document.
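The id format can be unpacked with a regular expression. The sketch below is based only on the example id above; the names are hypothetical, and it assumes the hash always has the shape `<first>_<n>_<last>_<m>_len_<length>`.

```python
import re
from typing import NamedTuple

# container id, '-u-', then first word, its ordinal, last word, its ordinal, length
UNENUMERATED_ID = re.compile(
    r"^(?P<container>.+?)-u-"
    r"(?P<first>.+?)_(?P<first_ord>\d+)_"
    r"(?P<last>.+)_(?P<last_ord>\d+)_len_(?P<length>\d+)$"
)


class UnenumeratedId(NamedTuple):
    container_id: str
    first_word: str
    first_word_ordinal: int
    last_word: str
    last_word_ordinal: int
    text_length: int


def parse_unenumerated_id(provision_id):
    """Split an unenumerated provision id into its components."""
    match = UNENUMERATED_ID.match(provision_id)
    if match is None:
        raise ValueError(f"not an unenumerated provision id: {provision_id!r}")
    return UnenumeratedId(
        match["container"], match["first"], int(match["first_ord"]),
        match["last"], int(match["last_ord"]), int(match["length"]),
    )
```

Applied to the example id, this yields the container `H8F003FE3EB5543B3850460B8C38DB194`, the 13th 'That', the first 'improvements', and a length of 322.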
Merged provisions are created from a contiguous range of existing provisions (e.g. section 123(a) through section 124(c)). Their id is the id of the first provision in the range + '-m-' + the id of the last provision in the range.
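A merged id, and the kind of any provision id, can be derived from these markers. This sketch uses the `-m-` separator as it appears in stored ids such as `H190E3D49159B466CAA40B44E9EB298E1-m-HD9686292312143E897C13C4D3783D6A0`; the function names are hypothetical, and it assumes ids of original provisions never contain either marker.

```python
MERGE_MARKER = "-m-"
UNENUMERATED_MARKER = "-u-"


def merged_id(first_id, last_id):
    """Id of a merged provision spanning first_id through last_id."""
    return f"{first_id}{MERGE_MARKER}{last_id}"


def provision_kind(provision_id):
    """Classify an id as 'unenumerated', 'merged' or 'original'."""
    # Check '-u-' first: the word hash of an unenumerated id is free text.
    if UNENUMERATED_MARKER in provision_id:
        return "unenumerated"
    if MERGE_MARKER in provision_id:
        return "merged"
    return "original"
```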
The `comparefinal` document contains the classification data after a reviewer considers all individual classifications and decides on a final disposition for each provision of the bill.
The `bluelinefinal` document contains data at a more granular level, used for creating cards. When the bluelinefinal items are created, the XML for the selected provision (including parent provisions to the section level) is stored in `xml_card`.
The document as a whole has the following structure:
[source, javascript]
{
  _id: { $oid: "5ced2b14d76d712d9887fb7f" },
  billnumber: "116hjres31enr",
  user: "lskouras",
  billCongressTypeNumber: "116hjres31",
  billCongressTypeNumberVersion: "116hjres31enr",
  bill_info_ref_id: {…},
  citation_contexts: [],
  data: {
    ...
    H7331867E001A4653838DAF46B44F8ED8: {
      id: "H7331867E001A4653838DAF46B44F8ED8",
      identifier: "/dA/tII/s225",
      text: "",
      label: "dA tII 225",
      billpage: "13",
      treeLevel: 3,
      targetSection: "t49/s44901 nt new",
      target: "",
      author: "lskouras",
      comments: { classComment: "", execComment: "DT (sec." },
      text_offset: "35004:40048"
    },
    ...,
    H8F003FE3EB5543B3850460B8C38DB194-u-That_13_improvements_1_len_322: {
      id: "H8F003FE3EB5543B3850460B…_improvements_1_len_322",
      treeLevel: 2,
      textOrig: "That appropriations here…ldings and improvements",
      xml_card: "That appropriations here…ldings and improvements",
      text: "That appropriations here…ldings and improvements",
      textLength: 322,
      firstWord: "That",
      lastWord: "improvements",
      ordinalMatch: 13,
      ordinalMatchLastWord: 1,
      identifier: "/dB/tI/That_13_improvements_1_len_322",
      containerPath: "//u:main1/u:division2/u:title1",
      containerOffset: {…},
      textHash: "That_13_improvements_1_len_322",
      label: "dB tI That appropr...improvements",
      isUnenumerated: true,
      containerId: "H8F003FE3EB5543B3850460B8C38DB194",
      closestProvisionId: "H8F003FE3EB5543B3850460B8C38DB194",
      closestProvisionIdentifier: "/dB/tI",
      range: {…},
      billpage: 36,
      author: "lskouras",
      targetSection: "t7/s2254 new",
      comments: {…}
    },
    ...,
    H190E3D49159B466CAA40B44E9EB298E1-m-HD9686292312143E897C13C4D3783D6A0: {
      id: "H190E3D49159B466CAA40B44…12143E897C13C4D3783D6A0",
      identifier: "/dF/tVII/s7057/a/-m-/dF/tVII/s7057/e",
      text: "tSec. 7057.(a) Authority…itation set forth in su",
      label: "dF tVII 7057(a) - dF tVII 7057(e)",
      billpage: "359",
      treeLevel: 4,
      targetSection: "t22/s3948 nt new",
      target: "",
      text_offset: "1065083:1067143",
      rowIndex: 2125,
      focusInputFieldDoc: true,
      mergedFrom: …,
      range: {…},
      comments: {…},
      author: "lskouras"
    }
  },
  datakeys: [
    ...
    "H7331867E001A4653838DAF46B44F8ED8",
    ...
    "H8F003FE3EB5543B3850460B8C38DB194-u-That_13_improvements_1_len_322",
    ...
    "H190E3D49159B466CAA40B44E9EB298E1-m-HD9686292312143E897C13C4D3783D6A0",
    ...
  ],
  lastUpdated: { $date: 1558615937345 },
  author: "lskouras",
  _etag: {…},
  markedFinal: true,
  ranges: [
    {range: "91400:91493", content: "This division may be cit…ropriations Act, 2019”.", id: "H8D71055E369B4C1E977C1BC…-This_1_2019___1_len_93"},
    {…},
    {…}
    ...
  ]
}
Apps
[width="100%"]
|====================
| Dash | Overview of bills for each Congress, showing metadata about the bill and processing status | `/billtrack.html?congressnumber=115`
| Class | Add Classifications for an enrolled bill | `/classify.html?billnumber=115-1` or `/classify.html?billnumber=hr39`
| Compare | Compare classifications by different attorneys for an enrolled bill and set final classifications for the bill | `/compare.html?billnumber=115-1` or `/compare.html?billnumber=hr39`
| Blueline | Set bluelined provisions | `/blueline.html`
|====================
Authentication
Linux
Users are authenticated using Nginx Basic Authentication. See https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-http-basic-authentication/
- Verify that `httpd-tools` is installed (RedHat/CentOS) and install it if not.
- Create the first user:
$ sudo htpasswd -c /etc/nginx/.htpasswd user1
Enter the password at the prompt.
- Subsequent users can be created without the `-c` option:
$ sudo htpasswd /etc/nginx/.htpasswd user2
- Reference the password file in the Nginx server block:
server {
server_name class.olrc.xcential.com;
client_max_body_size 4G;
auth_basic "Restricted Content";
auth_basic_user_file /etc/nginx/.htpasswd;
...
}
Windows
Users are authenticated using Windows Active Directory. The Windows user is queried by ClassAct inside the Angular app.run function, using 'getUserName' in https://github.com/Xcential/LRC-Classification/blob/develop/app_static/apps/classify/js/app.js
`getUserName()` queries a service url (e.g. `/whoami.aspx`) and gets a JSON object in return, which specifies the username. This username is put into a local variable (`$scope.$stateParams.currentUser`). Internet service is monitored by querying a static `/ping.html` file. If the `currentUser` variable is not available, the `ping` function also queries `whoami` and populates the variable.
NOTE: To create web-only users on Windows, see http://stackoverflow.com/a/25210737/628748: "If you create a user with the advanced user management (from command line: netplwiz), then modify the group, remove users, and add IIS_IUSRS. They will be able to authenticate to your web page, but not the computer."