BinyotBot
Copying from the bot's Twitch about page:
A Markov chain Twitch bot that learns from chat messages. This is NOT AI. It randomly combines fragments from old messages dating back to 2019, including URLs and typos, creating incoherent sentences. Bot output doesn't reflect Vinny Vinesauce's views or opinions.
I was not planning for the source code to become public, so code quality/organisation leaves something to be desired. You should probably use this as a reference rather than trying to run the bot by yourself.
Unsolicited FAQs
Why is this process so manual?
I am running the bot on a Raspberry Pi, and generating the model is CPU intensive. This is because the Markov chain model I'm using relies on a simple weighted random selection algorithm to choose the next word of a sentence. This has many advantages: it generates more "human-like" messages than systems without this capability, and it reduces the incidence of slurs in the generated text. Most of those are automatically filtered by NightBot, but people can always get around the filters in creative ways. This is an O(n) operation that cannot be avoided (yes, I even tried Fenwick/segment trees). My laptop should be able to handle this operation fine (shoutouts to SQLite btw, people should be using it for all kinds of stuff), which is why I delegate all the model-building work to my laptop, while my Raspberry Pi handles the relatively less CPU intensive task of generating sentences. Moreover, I like having some control over the message ingestion process, so that I can inspect messages and make tweaks to it, such as filtering out subscription messages.
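To illustrate what weighted random selection of the next word can look like, here is a minimal Python sketch. The function name `pick_next_word` and the in-memory `counts` dictionary are assumptions made for illustration; the actual bot keeps its data in SQLite and may be organised quite differently. The linear walk over cumulative weights is one place where an O(n) cost shows up, though whether it is the same O(n) step mentioned above is also an assumption.

```python
import random

def pick_next_word(counts, rng=random):
    """Pick a key from `counts` with probability proportional to its count.

    `counts` maps each candidate next word to how often it followed the
    current state (hypothetical structure, for illustration only).
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("no candidates to choose from")
    # Draw a point in [0, total) and walk the cumulative weights: O(n).
    target = rng.random() * total
    running = 0
    for word, weight in counts.items():
        running += weight
        if target < running:
            return word
    # Floating-point edge case: fall back to the last candidate.
    return word
```

For example, `pick_next_word({"the": 5, "a": 1})` returns `"the"` about five times as often as `"a"`, so frequent continuations dominate while rare ones still appear occasionally.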
Why are you downloading from ChatReplay when chat messages can be obtained live on stream?
This is actually how it used to work. However, the bot would randomly fail to connect properly to the stream and miss some crucial moments in chat. So, to make this simpler for myself (arguably more complicated), I now download from ChatReplay instead. To avoid overloading their servers, I download chat messages in 1 second intervals, and most of the time it takes very few requests to download an entire stream's worth of messages. This also allows me to get messages from the distant past (back to 2019, when the first Vinny video was uploaded to ChatReplay). I also regularly donate to help cover their hosting expenses.
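A throttled download loop along these lines could look like the sketch below, assuming the 1 second interval refers to a pause between consecutive requests. ChatReplay's real API is not described here, so `fetch_page` is a purely hypothetical callable supplied by the caller, and the cursor-based paging is an assumption too.

```python
import time

def download_pages(fetch_page, delay_seconds=1.0, sleep=time.sleep):
    """Collect messages page by page, pausing between requests.

    `fetch_page(cursor)` is a hypothetical callable that returns
    `(messages, next_cursor)`, with `next_cursor` set to None on the last page.
    """
    messages = []
    cursor = None
    first = True
    while first or cursor is not None:
        if not first:
            sleep(delay_seconds)  # wait before each follow-up request
        page, cursor = fetch_page(cursor)
        messages.extend(page)
        first = False
    return messages
```

Passing `sleep` as a parameter keeps the loop easy to exercise without actually waiting, for instance by handing in a function that only records the requested delays.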
Why aren't you containerising your stuff?
It runs on a RAM-limited Raspberry Pi, and I need all the RAM I can get.
Why a Raspberry Pi?
Having to work within tight hardware constraints forces me to write more efficient code and keeps things interesting. The whole setup runs off a single USB cable and costs practically nothing to operate.
Did you manipulate the bot to say stuff?
Yes I did, on three separate occasions actually. The first was after the bot said something about Kanye West, and I said something about "un-VIPing" the bot, as I was uncomfortable with the bot being able to say anything it wants as a VIP, and it might reflect badly on Vinny as a streamer. I had just joined the stream and wasn't aware that the bot was hooked up to a TTS. It's not something I ever expected Vinny to do lmao. Hence, I was secretly glad that the bot was un-VIPed in the end, but for different reasons.
The second time was when Vinny asked the bot to say "micropennis", and I thought it would be funny to deactivate the bot, and only say "micropennis" when he was about to end stream. I mildly regret doing that as a bit, but it's still kinda amusing ngl.
The third instance happened some time ago, when I responded to someone through the bot. I don't recall the precise wording, but it was something like "you're not gonna believe this." This occurred during the bot's early period, and in my view, it doesn't really compare to the other two incidents I described.
Other than those moments, all messages are naturally generated. A lot of the time, it just coincidentally generates a sentence that fits the situation extremely well. Chat is extremely predictable, with a regular rotation of common topics being discussed on stream. There's an inherent conflict between responding to events as they happen and sticking to a predetermined posting interval.
This is where chatfinder comes in. Using that website, you can look up a chat message and it will show you matching results along with the username of who sent it. If you ever have any doubts as to whether a generated message is genuine, choose any 3 word chunk from the generated sentence and see if that chunk has ever been said by anyone in chat. If all of the chunks match, the message is most likely genuine. I could have manipulated the bot by crafting a sentence that satisfies this rule, but at that point it is way too much effort and I have better things to do with my life. The link to the chatfinder website can be found in the bot's about page. You can also use it to identify which chat messages and users contributed to generating a particular chat message.
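The 3 word chunk check can also be sketched in a few lines of Python. This is only an illustration: `chat_log` is a hypothetical in-memory list of past messages (chatfinder itself is a website, not this function), and splitting on whitespace with exact, case-sensitive matching is an assumption about how words are compared.

```python
def three_word_chunks(sentence):
    """Return every run of 3 consecutive words in `sentence`."""
    words = sentence.split()
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]

def unmatched_chunks(sentence, chat_log):
    """List the 3-word chunks of `sentence` that appear in no logged message.

    `chat_log` is a hypothetical in-memory list of past chat messages.
    """
    seen = set()
    for message in chat_log:
        seen.update(three_word_chunks(message))
    return [chunk for chunk in three_word_chunks(sentence) if chunk not in seen]

log = ["vinny play the game", "play the game again please"]
print(unmatched_chunks("vinny play the game again", log))  # → []
```

An empty result means every chunk has been said by someone before, which is consistent with a naturally generated message. A non-empty result lists the chunks worth looking up by hand. Sentences shorter than three words produce no chunks, so this check says nothing about them.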
Does it occasionally reproduce a chatter's previous message word-for-word?
Yeah, sometimes it does that, and that's a slight downside of my probability-based system. That issue should improve as more training data is incorporated over time.
Would you stop the bot if Vinny asks you to?
Yes.
Why did you make this?
I made this bot over multiple weekends to keep myself busy while trying to find a j*b. During college, I was reading up on Markov chains, and I was familiar with LaimuBot and was curious as to what years of chat messages in your channel would look like if put through a Markov chain. Now that I've acquired a j*b, I'm just maintaining this project in my free time, trying to make it showable to the public.
Why are you being so secretive about your identity?
Idk, I wasn't planning to run the bot long term, and at that point I decided to just keep my identity in chat a secret. It's fun in a way. Anyone in chat could be me. I'm genuinely not interested in acquiring "fame" or attention out of those generated chat messages, I just think they are funny.