content-to-reader v0.0.3
content-to-reader
Extract meaningful content from any website and turn it into an EPUB file. Optionally, send it to your device using your Gmail account.
How to install and use
- Install it globally using NPM (or any package manager)
npm i content-to-reader -g
- Generate a configuration file template by running
content-to-reader get-config config.yaml
- Edit the configuration file and run this command to create an EPUB and/or send it to your Kindle
content-to-reader create -c ./config.yaml
- Enjoy your articles
If you run into any issues, refer to the FAQ section below.
Use cases
Here are a few use cases and ideas that you can use as a starting point.
EPUB from a single URL
content-to-reader create https://welldone.com/@user/10_easy_steps_to_whatever
I want to choose what to extract
Sometimes you want to pick elements from a target website yourself, or maybe the default extraction didn't work well for you. Use selectors.
output: "./news.epub"
pages:
  - "https://clickbaitnews.com/article/some_article_12msad1"
  - url: "https://welldone.com/@user/10_easy_steps_to_whatever"
    selectors:
      - name: "Header"
        first: ".page-content header"
      - all:
          ".page-content .contents":
            [
              "h1",
              "h2",
              "h3",
              "h4",
              "h5",
              "p",
              "code",
              { ".custom-tip": ["p", "div", ".some-class": ["a", "p"]] },
            ]
      - first: ".page-content .comment-section"
Selectors let you pick elements from a target website using CSS selectors. Use first to include only the first matched element, or all to include every matched element.
The final EPUB will contain all of the elements found by the selectors.
You can generate longer CSS Selectors without repetition using YAML's dictionaries and arrays, for example:
- all:
    ".page-content .contents": ["h1", "h2", "h3"]
is equivalent to
- all: ".page-content .contents h1, .page-content .contents h2, .page-content .contents h3"
You can nest dictionaries in arrays recursively.
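The expansion described above can be sketched in a few lines. This is only an illustration, not the tool's actual implementation: the names SelectorNode, expand and toCssSelector are hypothetical, and the sketch assumes that a parent selector and its nested entries are joined with a space (the CSS descendant combinator), as the example above suggests.

```typescript
// Hypothetical model of a selector entry: either a plain CSS selector
// string, or a dictionary mapping a parent selector to nested entries.
type SelectorNode = string | { [parent: string]: SelectorNode[] };

// Expand a node into the list of full CSS selectors it stands for.
// Assumption: parent and child are joined with a space.
function expand(node: SelectorNode, prefix: string = ""): string[] {
  if (typeof node === "string") {
    return [prefix ? `${prefix} ${node}` : node];
  }
  const result: string[] = [];
  for (const parent of Object.keys(node)) {
    const nextPrefix = prefix ? `${prefix} ${parent}` : parent;
    for (const child of node[parent]) {
      // Recurse so that dictionaries nested inside arrays are handled too.
      result.push(...expand(child, nextPrefix));
    }
  }
  return result;
}

// Join the expanded selectors into one comma-separated selector list.
function toCssSelector(node: SelectorNode): string {
  return expand(node).join(", ");
}

console.log(toCssSelector({ ".page-content .contents": ["h1", "h2", "h3"] }));
// prints ".page-content .contents h1, .page-content .contents h2, .page-content .contents h3"
```

A nested entry such as { ".custom-tip": ["p", { ".some-class": ["a", "p"] }] } would expand, under the same assumption, to ".custom-tip p, .custom-tip .some-class a, .custom-tip .some-class p".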
Send to Kindle
content-to-reader allows you to use services like Amazon's "Send To Kindle":
toDevice:
  deviceEmail: your_kindle_A3BcD2@kindle.com
  senderEmail: your_email@gmail.com
  senderPassword: "your password"
pages:
  - https://welldone.com/@user/10_easy_steps_to_whatever
If you've never sent to Kindle using email before, there are a few steps to follow in order to make this work.
First, whitelist your email address in Amazon, then create an application password for your Gmail account so you can use it in the .yaml config file. That should do it.
Currently only Gmail's SMTP server is supported.
Configuration file template and documentation
# Filename or output path of the resulting EPUB file. Not required if `toDevice` is present.
output: "news.epub"
# In this section you configure automatic sending of the resulting EPUB file to your device using your Gmail account. Your credentials aren't stored in any way and are used solely for sending the resulting file to your device. Currently only Gmail is supported. Not required if `output` is present.
toDevice:
  # This is the email address of your reader device (e.g. a Kindle reader).
  deviceEmail: ""
  # This is the email address of your Gmail account.
  senderEmail: ""
  # This is an application password for your Gmail account. Read up on how to generate one: https://support.google.com/mail/answer/185833?hl=en
  senderPassword: ""
# In this section you configure the content present in the resulting EPUB file.
pages:
  # You can extract content automatically by passing a URL only.
  - "https://page.com"
  # Or use selectors to pick what you want.
  - url: "https://page.com"
    selectors:
      # You can select the first element encountered...
      - name: "Header" # Name is not required, but it may help with debugging
        first: ".page-content header"
      # ... or all of them.
      - name: "Content"
        all:
          # Use nested selectors to create verbose element queries
          ".page-content .contents":
            [
              "h1",
              "h2",
              "h3",
              "h4",
              "h5",
              "p",
              "code",
              { ".custom-tip": ["p", "div", ".some-class": ["a", "p"]] },
            ]
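The template comments say that output is not required when toDevice is present, and the other way round. A minimal sketch of that rule as a check might look like the following. The types and the findConfigProblems function are hypothetical and are not part of content-to-reader; the sketch also assumes that all three toDevice fields must be non-empty, which the template implies but does not state.

```typescript
// Hypothetical types mirroring the template above; not the tool's real API.
interface ToDevice {
  deviceEmail: string;
  senderEmail: string;
  senderPassword: string;
}

type Page = string | { url: string; selectors?: unknown[] };

interface Config {
  output?: string;
  toDevice?: ToDevice;
  pages: Page[];
}

// Return a list of human-readable problems. An empty list means the
// config passes these (assumed) checks.
function findConfigProblems(config: Config): string[] {
  const problems: string[] = [];

  // `output` and `toDevice` are each optional only if the other is present.
  if (!config.output && !config.toDevice) {
    problems.push("Set `output`, `toDevice`, or both.");
  }

  // Assumption: when `toDevice` is used, every field must be non-empty.
  const device = config.toDevice;
  if (device) {
    const fields: (keyof ToDevice)[] = ["deviceEmail", "senderEmail", "senderPassword"];
    for (const field of fields) {
      if (device[field].trim() === "") {
        problems.push(`toDevice.${field} must not be empty.`);
      }
    }
  }

  return problems;
}

// Neither `output` nor `toDevice` is set, so one problem is reported.
console.log(findConfigProblems({ pages: ["https://page.com"] }));
```

Returning a list of problems rather than throwing on the first one lets all issues in a config file be reported in a single pass.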
FAQ
Is your email address known by Amazon? If not, whitelist your email address in Amazon.
Isn't your file too big? Remember that "Send to Kindle" imposes a 50 MB limit.
Sometimes Amazon just rejects a file for whatever reason. You can use Calibre as a last resort and let it do its magic so Amazon accepts your file. There's a ton of material on this on the Internet.
You can't use your regular Gmail password. Create an application password for your Gmail account here: https://support.google.com/mail/answer/185833?hl=en. Now you can use it in the .yaml config file.
Currently there is no way to change this behaviour.
License
Licensed under The Prosperity Public License 3.0.0.
Contributions
Any contributions are welcome. If you have an idea or you spotted a bug, feel free to open an issue or a pull request.