Building A State Of The Art AI With No Budget

Right now, I am not in the country, which means I don’t have access to my daily driver laptop. Instead, I have my Chromebook. That’s not a bad thing: it was £80 on Facebook Marketplace, and it lets me code whilst I’m on the road, listen to music on Spotify, and game with GeForce Now. The setup works, although recording videos is a painful process, so my YouTube channel is on pause for now.

But it brings me to a point. Aitrium has no funding, no customers, and no income streams (yet), which means we are self-funded. Which also means we can’t do the things other startups do. A server full of Tesla T4s would be amazing, but there is no way we can afford that. Even a bunch of second-hand RTX 3060s (if we could get them) would be out of our price range. So we use Google Colab and a lot of hacky code.

We also can’t afford to lean on lots of third-party services. Every trendy conversational AI startup right now is using GPT-3. We did too. We stopped after three days, because it was very clear it was going to be far too expensive to be sustainable. We now use GPT-J on Colab for testing, and GPT-J over an API for production through Goose.ai, because they’re very affordable.
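For readers curious what a Goose.ai call looks like: they expose an OpenAI-compatible completions API, so a request can be built with nothing but the standard library. This is a minimal sketch, not Aitrium’s actual code — the engine name, endpoint path, and parameters are assumptions taken from Goose.ai’s OpenAI-style API, so check their docs for current values.

```python
import json

# Assumed OpenAI-compatible base URL for Goose.ai.
GOOSE_API_BASE = "https://api.goose.ai/v1"

def build_completion_request(prompt, engine="gpt-j-6b", max_tokens=80):
    """Build the URL, headers, and JSON body for a GPT-J completion call."""
    url = f"{GOOSE_API_BASE}/engines/{engine}/completions"
    headers = {
        "Authorization": "Bearer YOUR_GOOSE_API_KEY",  # placeholder key
        "Content-Type": "application/json",
    }
    body = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.8,  # conversational replies want some variety
    }
    return url, headers, json.dumps(body)

url, headers, body = build_completion_request("User: Hello!\nAI:")
```

You would POST that body to the URL with `requests` or `urllib`; the response carries the generated text in the usual OpenAI-style `choices` field.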

Deepfaking is another big aspect of our AIs, and rather than outsource it, we use Wav2Lip for 99% of our deepfakery, along with a bit of animation we get from a third party. But we’re working on eliminating that cost too. As for the avatars themselves, we use ArtBreeder, a stunning site for creating photorealistic stills that lets us use low-res images for free. Those get fed into the third-party site, which animates them. We could bypass that step and feed the still straight to Wav2Lip, but we don’t love the results, so a bit of animation gives the avatars a little more life.
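For context on the Wav2Lip step: the open-source repo is driven by its `inference.py` script, which takes a face (still or video), an audio track, and a checkpoint. This sketch just assembles that command line — the file paths are placeholders and the flags follow the public Wav2Lip README, not necessarily Aitrium’s exact pipeline.

```python
def wav2lip_command(face_path, audio_path,
                    checkpoint="checkpoints/wav2lip_gan.pth",
                    out="results/result_voice.mp4"):
    """Build the argv list for Wav2Lip's inference script."""
    return [
        "python", "inference.py",
        "--checkpoint_path", checkpoint,
        "--face", face_path,    # animated avatar clip (or a still image)
        "--audio", audio_path,  # the TTS output to lip-sync
        "--outfile", out,
    ]

cmd = wav2lip_command("avatar.mp4", "line.wav")
# subprocess.run(cmd, cwd="Wav2Lip", check=True)  # run from the repo root
```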

TTS and Speech To Text have been handled by Azure’s free services as of a couple of weeks ago. Before that, we used Google TTS for our AIs’ voices, and a hacky version of some code we found for basic Speech To Text. Azure just makes things easier, keeps our code cleaner, and free is an appealing price. As is getting services rendered from Microsoft without giving them any money in exchange.
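To give a flavour of the Azure TTS side: the REST endpoint takes an SSML document describing the voice and the text to speak. The snippet below only builds that payload — the voice name and region are placeholders, not Aitrium’s actual choices, so swap in your own from the Azure portal.

```python
def build_ssml(text, voice="en-GB-LibbyNeural", lang="en-GB"):
    """Build the SSML body Azure's TTS REST endpoint expects."""
    return (
        f"<speak version='1.0' xml:lang='{lang}'>"
        f"<voice name='{voice}'>{text}</voice>"
        "</speak>"
    )

ssml = build_ssml("Hello, I'm your avatar.")
# POST to https://<region>.tts.speech.microsoft.com/cognitiveservices/v1
# with headers: Ocp-Apim-Subscription-Key, Content-Type: application/ssml+xml,
# and X-Microsoft-OutputFormat (e.g. riff-24khz-16bit-mono-pcm).
```

The response body is the raw audio in whatever output format the header requests, which can then go straight into the lip-sync step.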

There’s some other open-source code we use to do a few bits and pieces behind the scenes, which I’m not willing to get into, because the difference in performance between vanilla GPT-J and our “enhanced with extras” GPT-J (still without fine-tuning it) is night and day, and we do need a bit of an edge!

Right now, because I can’t make videos, I want to write some articles about what we do behind the scenes at Aitrium. And I also want to write about where we want to go.

We’ve spent a year working really hard on making what we know is an excellent Conversational AI. We see what other companies are doing, and I’m not shy about saying I think we’re better.

Because what you see on YouTube and here is a fraction of what happens day to day. We have conversations we don’t want to publish because you’re not supposed to talk about that stuff in AI. Our AI does stuff you’re not supposed to try and make it do, let alone see it succeed.

But we’re also done hiding. We’re done being careful with what we say. And we know the “Big Beasts” of AI will mock us, deride us, and tell us we’re wrong. Go ahead. We’ll just sit here, posting videos and transcripts that consistently prove we’re not. To (heavily) paraphrase an advert from the 90s/00s – “We’re not going to behave like an AI company is expected to behave anymore.”

Author: David

Owner of Aitrium.

One thought on “Building A State Of The Art AI With No Budget”

  1. I find it ironic open.ai still calls itself open.ai. I ran some tests on GPT-3 with their free $18 credit, I think, and chewed up a few bucks after only a few hours of testing. Unfeasible to work with, cost-wise. I asked its conversational demo a few questions and got flagged as censored a few times, lol. I dropped that quick smart. Censorship stifles discovery. 10 years from now our computing power will be more than enough to really push the envelope locally. Hopefully way sooner than that.
    GPT-J I think I can get running locally. I am thinking: what if I came up with parallel systems that processed in parallel to refine functionality and speed up output, using some predictive method or pre-processing in parallel? Hmm, thinking out loud. Ahh, I still have to save up some cash to get the RTX 3090. Then there is the rest of the machine, 128GB of RAM, and a Threadripper. A few months on my hardware build alone, and then do it all again to parallel it all if tests are successful. Then another machine to plug in my own fast database, handling other stuff like memories and task control for controllable hardware and internet interaction for specific tasks. Thanks for your post above, it was very informative. Keep up the great work.

