aboutsummaryrefslogtreecommitdiff
path: root/README.md
blob: f7f5725c101389b1c319f9d6042c9164f3689d7a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
# ebooks

Fediverse ebooks bot using neural networks


## Usage

First, install Python dependencies using your distro's package manager or `pip`: [psycopg2](https://www.psycopg.org), [torch](https://pytorch.org/), [transformers](https://huggingface.co/docs/transformers/index), and [datasets](https://huggingface.co/docs/datasets/). Additionally, for Mastodon and Pleroma, install [Mastodon.py](https://mastodonpy.readthedocs.io/en/stable/), and for Misskey, install [Misskey.py](https://misskeypy.readthedocs.io/ja/latest/). If your database or platform isn't supported, don't worry! It's easy to add support for other platforms and databases, and contributions are welcome!

Now generate the training data from your fediverse server's database using `python data.py -d 'dbname=test user=postgres password=secret host=localhost port=5432'`. You can skip this step if you have collected training data from another source.

Next, train the network with `python train.py`, which may take several hours. It's a lot faster when using a GPU. If you need advanced features when training, you can also train using [run_clm.py](https://github.com/huggingface/transformers/blob/master/examples/pytorch/language-modeling/run_clm.py).

Finally, create an application for your bot account and generate an access token. Run the bot with `python bot.py -b server_type -i fediverse.instance -t access_token`. You can omit `-b server_type` for Mastodon and Pleroma. To run the bot periodically, create a cron job. Enjoy!


## Resources

- https://closeheat.com/blog/pytorch-lstm-text-generation-tutorial

- https://trungtran.io/2019/02/08/text-generation-with-pytorch/

- https://huggingface.co/docs/transformers/training

- https://huggingface.co/blog/how-to-generate

- https://github.com/huggingface/transformers/tree/master/examples/pytorch/language-modeling