Someone on Slack created a #haiku channel, and that’s reminded me I’ve always wanted to make a haiku bot. I’ve decided to just extend an old markov generator slack bot I made a while back, using this as an opportunity not just to do the fun haiku stuff but also to get some experience with organizing a python project (versus a messy one-off script).

For a little background…

  • Instead of just forking some random haiku detector repo (there are quite a few), I decided to do a “guided” implementation, i.e. following this guy’s approach to haiku detection. He derives syllable information from the CMU Pronouncing Dictionary, resulting in a static dictionary of words/syllables (versus some sort of smarter natural language thing).
  • I finally created a github repo for it. See: https://github.com/andreimarks/ambot-slack . Unfortunately, I was a little bit lazy w/r/t setting up even the original local repo. I think I just wrote the original version of the script in a feverish evening/afternoon and set the repo up as an afterthought later. The markovslack bot is actually in a broken state, unfortunately. But I suppose after I’m finished with the haiku bot I might be motivated enough to go back and fix it.
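
For reference, the static words/syllables dictionary idea can be sketched like this. It's a minimal sketch, not the code from the approach I'm following: it assumes the cmudict-0.7b line format (word, then phonemes, with vowel phonemes ending in a stress digit 0/1/2), and the sample lines and the names `load_syllables` and `count_syllables` are made up for illustration.

```python
import re

# Illustrative lines in the cmudict-0.7b format (assumed; check the real file).
SAMPLE_LINES = """\
;;; comment lines start with three semicolons
HELLO  HH AH0 L OW1
HELLO(1)  HH EH0 L OW1
CALIFORNIA  K AE2 L AH0 F AO1 R N Y AH0
"""


def load_syllables(lines):
    """Build a {word: syllable_count} dict from cmudict-style lines."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blanks and comments
        word, *phonemes = line.split()
        # Alternate pronunciations look like WORD(1); fold them into WORD.
        word = re.sub(r"\(\d+\)$", "", word).lower()
        # Vowel phonemes carry a stress digit, so counting them counts syllables.
        syllables = sum(p[-1].isdigit() for p in phonemes)
        counts.setdefault(word, syllables)  # keep the first pronunciation only
    return counts


def count_syllables(text, counts):
    """Total syllables in text, or None if any word is missing from the dict."""
    total = 0
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in counts:
            return None
        total += counts[word]
    return total


counts = load_syllables(SAMPLE_LINES.splitlines())
print(count_syllables("Hello California", counts))
```

Returning None for unknown words is a design choice in this sketch: with a static dictionary there is no way to guess, so it seems safer to skip the message than to post a miscounted haiku.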

Goals

  1. A working haiku bot module. MVP is a bot that checks all incoming messages for haikus and posts them to the #haiku channel, as well as
  2. Clean up the ambot repo so that it’s organized and modular.
  3. Move the json files that are storing all the slack messages I scrape (I don’t have admin access for a full export) over to a real database (probably will use sqlite3).
  4. Fix markovslack back up.
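
For goal 3, the json-to-sqlite3 move could look roughly like the sketch below. The layout of my scraped files isn't shown here, so this assumes each file holds a list of Slack-style message objects with "user", "text" and "ts" keys; the table name, column names and `import_messages` are hypothetical.

```python
import json
import sqlite3

# Assumed shape of one scraped file: a JSON list of message objects.
SAMPLE_JSON = """
[
  {"user": "U123", "text": "first message", "ts": "1500000000.000100"},
  {"user": "U456", "text": "second message", "ts": "1500000001.000200"}
]
"""


def import_messages(conn, messages, channel):
    """Insert messages into a (hypothetical) messages table, skipping duplicates."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               channel TEXT NOT NULL,
               ts      TEXT NOT NULL,
               user_id TEXT,
               text    TEXT,
               PRIMARY KEY (channel, ts)
           )"""
    )
    rows = [(channel, m["ts"], m.get("user"), m.get("text", "")) for m in messages]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO messages (channel, ts, user_id, text) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )


conn = sqlite3.connect(":memory:")  # a file path would be used for a real archive
messages = json.loads(SAMPLE_JSON)
import_messages(conn, messages, "#general")
import_messages(conn, messages, "#general")  # re-running a scrape adds nothing new
print(conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0])
```

Keying on (channel, ts) plus INSERT OR IGNORE means overlapping scrapes can be imported repeatedly without creating duplicate rows.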

Haiku Module

Sort of starting up again in the middle of everything. I have had to do a ton of super basic syntax googling for python, haha. It’s very use it or lose it.

The current iteration is basically just a text-searcher that goes through every possible syllable set in the text it’s given. It generates a lot of crap because I’m not starting searches at the beginning of clauses, so I’d say 90% of the haikus found are really nonsensical because they are just internal snapshots of sentences. Mainly I’m surprised at how many 5-7-5 combinations you can find in a text. In any case, a pass through the Babel fish intro in The Hitchhiker’s Guide to the Galaxy did pull up:

It proves you exist,
and so therefore, by your own
arguments, you don't.
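
The "try every starting word" search described above can be sketched as follows. This is a simplified stand-in, not the actual code in haiku.py: `find_haikus` is a hypothetical name, and the syllable counts are hand-written for this one sentence rather than loaded from the CMU dictionary.

```python
# Hand-written syllable counts for the example sentence (illustration only).
SYLLABLES = {
    "it": 1, "proves": 1, "you": 1, "exist": 2, "and": 1, "so": 1,
    "therefore": 2, "by": 1, "your": 1, "own": 1, "arguments": 3, "don't": 1,
}


def find_haikus(words, syllables, pattern=(5, 7, 5)):
    """Return every run of words that splits exactly into the syllable pattern."""
    results = []
    for start in range(len(words)):
        i = start
        lines = []
        for target in pattern:
            total, line = 0, []
            # Take words until this line reaches (or overshoots) its target.
            while i < len(words) and total < target:
                key = words[i].lower().strip(".,!?;:\"'")
                count = syllables.get(key)
                if count is None:
                    break  # unknown word: this starting point cannot work
                total += count
                line.append(words[i])
                i += 1
            if total != target:
                break  # overshot, ran out of words, or hit an unknown word
            lines.append(" ".join(line))
        else:
            results.append(lines)  # all three lines matched exactly
    return results


text = "It proves you exist, and so therefore, by your own arguments, you don't."
haikus = find_haikus(text.split(), SYLLABLES)
for line in haikus[0]:
    print(line)
```

Because every word index is a candidate start, most hits in a long text begin mid-sentence. Restricting `start` to words that follow sentence-ending punctuation would be one way to cut down on the nonsense.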

Next step is hooking the bot up to Slack messages. Thinking that this is where I should start getting into point 2, which would be restructuring the repo in such a way that I’ve got the haiku module, the markov module, the slack bot module, and the slack message archive module. Otherwise, the first step is just having the bot log in and read all incoming messages.

Structuring the Project

Just going to run with whatever this article suggests, really. This linked article on modules was sort of helpful, but a little bit in the weeds. Dead Simple Python actually had the right amount of nitty gritty/explanations/example code so I mainly templated that. Ended up with (learned about tree!):

.
├── README.md
├── ambot
│   ├── __init__.py
│   ├── __main__.py
│   ├── app.py
│   ├── haiku.py
│   └── markov.py
└── data
    ├── cmudict-0.7b.txt
    ├── json

Just going to leave everything in the ambot directory for now, because there’s nothing really complicated at the moment. But I think I get the idea about nesting modules. I left out a lot of the other common things like tests, requirements.txt, setup.py because I just want to get things built out right now. But the structure at least feels a little more sensible.

Actually, am having a little bit of trouble structuring the application to run via __main__.py and then also through the individual submodule scripts, but will worry about that later. Also definitely not sure I understand the purpose of __init__.py beyond the “defining a package/module” bit.
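
One pattern that may help here, sketched under assumptions: run everything with `python -m` from the directory above the package, so that both `python -m ambot` (which executes `__main__.py`) and `python -m ambot.haiku` (which executes the submodule with its `if __name__ == "__main__":` guard) see the same package context. An `__init__.py`, even an empty one, marks the directory as a regular package and is executed when the package is first imported. The demo below builds a throwaway package called `demo_pkg` (a made-up name, not the real ambot code) in a temp directory and runs it both ways.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

# Contents of a throwaway package; names are hypothetical stand-ins for ambot.
FILES = {
    "__init__.py": "",
    "haiku.py": '''
        def main():
            print("haiku module running")

        if __name__ == "__main__":
            main()
    ''',
    "__main__.py": '''
        from . import haiku

        print("package entry point")
        haiku.main()
    ''',
}


def run_demo():
    """Write the package to a temp dir and run it as a package and as a submodule."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_pkg"
        pkg.mkdir()
        for name, source in FILES.items():
            (pkg / name).write_text(textwrap.dedent(source))
        outputs = {}
        for target in ("demo_pkg", "demo_pkg.haiku"):
            proc = subprocess.run(
                [sys.executable, "-m", target],
                cwd=tmp,  # run from the directory that contains the package
                capture_output=True,
                text=True,
                check=True,
            )
            outputs[target] = proc.stdout.splitlines()
        return outputs


outputs = run_demo()
print(outputs)
```

Running `python haiku.py` directly from inside the package directory is the case that tends to break, because relative imports such as `from . import haiku` have no package to resolve against.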

Separating markov.py and slackbot.py, and integrating a haiku bot

The goal here is just to separate out the slackbot functions from the markov chain generator functions in markov.py. For this first step will probably keep the slack data access inside the slackbot script, although I will set that up separately in the end.

  • Huh, so just getting the bot up again wasn’t too much of an issue. Don’t think it’s really hooked up to any of the scraped json messages I have or anything, but it is at least running and available.
  • So the next hurdle is figuring out how to best set up a haiku listener. Well, I guess first I’ll just run it in slackbot, but I think ultimately this will be the first bit of real OOP that should be done in this project.
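
A possible shape for the haiku listener, as a rough sketch only. It deliberately avoids any real Slack client calls: `detect` and `post` are plain callables passed in, so the class can be exercised without a connection. `HaikuListener`, `handle_message` and the stub functions are hypothetical names.

```python
class HaikuListener:
    """Checks incoming message text for haikus and reposts any it finds."""

    def __init__(self, detect, post, channel="#haiku"):
        self.detect = detect    # callable: text -> list of three lines, or None
        self.post = post        # callable: (channel, text) -> None
        self.channel = channel

    def handle_message(self, text):
        """Return True if the message was a haiku and got posted."""
        lines = self.detect(text)
        if lines is None:
            return False
        self.post(self.channel, "\n".join(lines))
        return True


# Stand-ins for the real detector and the real Slack posting call.
def fake_detect(text):
    return text.split(" / ") if text.count(" / ") == 2 else None


posted = []


def fake_post(channel, text):
    posted.append((channel, text))


listener = HaikuListener(fake_detect, fake_post)
listener.handle_message("just a normal message")
listener.handle_message("line one / line two / line three")
print(posted)
```

Injecting the detector and the posting function keeps the listener independent of both haiku.py and the slackbot code, which fits the goal of separating the modules.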

TODO

  • https://stackabuse.com/a-sqlite-tutorial-with-python/
  • https://docs.quantifiedcode.com/python-anti-patterns
  • https://www.python.org/dev/peps/pep-0008/
  • http://as.ynchrono.us/2007/12/filesystem-structure-of-python-project_21.html
  • https://stackoverflow.com/questions/193161/what-is-the-best-project-structure-for-a-python-application
  • https://gehrcke.de/2014/02/distributing-a-python-command-line-application/
  • https://julien.danjou.info/starting-your-first-python-project/ (good notes on __init__.py usage)
  • https://automatetheboringstuff.com/chapter8/ (Needed to learn how to reference files relatively)
  • https://pfertyk.me/2016/11/automatically-respond-to-slack-messages-containing-specific-text/ (another example rundown of bot, plus webhooks approach)
  • https://medium.com/greenroom/the-slack-bot-tutorial-i-wish-existed-d53133f03b13 (ditto)

  • Models:
    • https://github.com/tlastowka/python-haikubot/blob/master/haiku.py

Haiku

Gold green blue and sweet:
through summer grass trees sky high
California breathes.