1.1.4 • Published 5 years ago
@eirikb/normalize-html-table v1.1.4
normalize-html-table
Normalization of DOM table rows - creates a matrix with duplicate cells based on rowspan and colspan.
Handy for scraping and parsing of Wikipedia tables.
- Vanilla DOM - no dependencies.
- Does one job only - rowspan and colspan.
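To make "a matrix with duplicate cells" concrete, here is a minimal sketch of the idea on plain objects instead of DOM nodes. It is not this library's implementation: `normalizeSpans` and the `{ value, rowSpan, colSpan }` cell shape are hypothetical names chosen for illustration only.

```javascript
// Hypothetical helper, NOT part of @eirikb/normalize-html-table.
// Assumption: each input cell is a plain object { value, rowSpan?, colSpan? }.
function normalizeSpans(rows) {
  // One output row per input row; slots are filled as spans are expanded.
  const matrix = rows.map(() => []);
  rows.forEach((cells, r) => {
    let c = 0;
    for (const cell of cells) {
      // Skip slots already taken by a rowspan from an earlier row.
      while (matrix[r][c] !== undefined) c++;
      const rowSpan = cell.rowSpan || 1;
      const colSpan = cell.colSpan || 1;
      // Put the same cell reference into every slot it covers.
      for (let dr = 0; dr < rowSpan && r + dr < rows.length; dr++) {
        for (let dc = 0; dc < colSpan; dc++) {
          matrix[r + dr][c + dc] = cell;
        }
      }
      c += colSpan;
    }
  });
  return matrix;
}

// "A" spans two rows, "B" spans two columns.
const input = [
  [{ value: 'A', rowSpan: 2 }, { value: 'B', colSpan: 2 }],
  [{ value: 'C' }, { value: 'D' }],
];
const values = normalizeSpans(input).map(row => row.map(cell => cell.value));
console.log(values); // [ [ 'A', 'B', 'B' ], [ 'A', 'C', 'D' ] ]
```

Every slot covered by a span holds the same cell reference, so each row ends up with the same number of columns and can be indexed by column position.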
Usage
npm i @eirikb/normalize-html-table

import normalizeHtmlTable from '@eirikb/normalize-html-table';
const table = document.querySelector('table');
const rows = normalizeHtmlTable(table);
console.log(rows);

This will return a matrix of rows and cells. Each cell contains the td element.
Each row has a property named row attached to it, in case you need to reference the original tr element.
E.g.,
console.log(rows[0].row); // tr element

Notes
This library will not:
- Map your table to a JavaScript object.
- Do anything with your headers.
- Convert cells to text.
- Support older browsers (you must transpile it).
All of the above can be solved in your own code and does not fit into this library. E.g., converting to a JavaScript object with cells turned into text can be done like this:
function tableToJson(table) {
  // Use the text of every th in the table as the object keys.
  const headers = [...table.querySelectorAll('th')].map(th => th.textContent.trim());
  // Build one object per normalized row, keyed by header and indexed by column.
  return normalizeHtmlTable(table).map(row =>
    headers.reduce((res, header, index) => {
      res[header] = row[index].textContent.trim();
      return res;
    }, {})
  );
}

For Node.js support, use jsdom.