while(motivation <= 0)

Projects for the upcoming Holiday break
With a couple of days of extra break for Thanksgiving, I’ve been pondering what I should work on next. My adventures in C++ tend to be frustrating, so I think I’ll give it a rest and spend some time working on the blog. That leaves the problem of what else I’d like to see the blog do. One problem that I’ve been having with uploading media to this blog is that some devices automatically save to Apple’s mov format instead of the more universal mp4. I tried in the past to implement an automatic conversion on the web server itself. I got this working, but it isn’t built for larger files, and it’s far too heavy a process for a web server. This morning I was thinking about using AWS Simple Queue Service for this task, but I suspect that service would be overkill considering how simple it would be to write a batch job that ran on a container whose sole job would be to check a folder for mov files, convert them, and move them to the production bucket. I never got around to increasing the logging that I’m doing on the blog either, so this would also be a good opportunity for a bit of refactoring as well as refining my existing processes and logging.
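A rough sketch of the batch job I have in mind (the bucket names and the uploads/ prefix are placeholders, and it assumes ffmpeg is installed in the container image):

# Sketch only: bucket names and prefix are made up, ffmpeg assumed present in the image
import subprocess
import boto3

s3 = boto3.client("s3")
STAGING_BUCKET = "blog-media-staging"    # hypothetical
PRODUCTION_BUCKET = "blog-media-prod"    # hypothetical

def convert_pending_movs():
    listing = s3.list_objects_v2(Bucket=STAGING_BUCKET, Prefix="uploads/")
    for obj in listing.get("Contents", []):
        key = obj["Key"]
        if not key.lower().endswith(".mov"):
            continue
        local_mov = "/tmp/" + key.split("/")[-1]
        local_mp4 = local_mov[:-4] + ".mp4"
        s3.download_file(STAGING_BUCKET, key, local_mov)
        # Convert with ffmpeg; -y overwrites any stale temp file from a previous run
        subprocess.run(["ffmpeg", "-y", "-i", local_mov, local_mp4], check=True)
        s3.upload_file(local_mp4, PRODUCTION_BUCKET, key[:-4] + ".mp4")
        s3.delete_object(Bucket=STAGING_BUCKET, Key=key)

if __name__ == "__main__":
    convert_pending_movs()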
Implementing WatchTower in python flask
When implementing permissions for AWS EC2 instances, the way to go is IAM roles. Most of the time this is fine; sometimes it can be a pain. Please ignore anyone who tells you to hard-code AWS CLI access credentials into your code. To get started shipping my flask logs into AWS, I added the watchtower library to my project and imported it.
The actual code to wire watchtower into the app:
import logging
import platform
from watchtower import CloudWatchLogHandler

# Configure the Flask logger to ship to CloudWatch.
# app is the Flask instance and timeobj is a datetime captured at startup (defined elsewhere).
logger = logging.getLogger(__name__)
cloud_watch_stream_name = "vacuum_flask_log_{0}_{1}".format(platform.node(), timeobj.strftime("%Y%m%d%H%M%S"))
cloudwatch_handler = CloudWatchLogHandler(
    log_group_name='vacuum_flask',            # Replace with your desired log group name
    log_stream_name=cloud_watch_stream_name,  # watchtower 3.x keyword (older releases call it stream_name)
)
app.logger.addHandler(cloudwatch_handler)
app.logger.setLevel(logging.INFO)
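With the handler attached, anything sent through the Flask logger (app.logger.info("saved post"), for example) should end up in the vacuum_flask log group, provided the instance role has the permissions below.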

IAM permissions required

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:DescribeLogStreams"
            ],
            "Resource": "*"
        }
    ]
}

Finishing touches

The last thing that proved to be an issue was that boto3 couldn’t find the default region in my containers. This has come up before, but today I was able to find a way around it by adding a default AWS CLI config file to my deployment and telling boto3 where to find it with the environment variable AWS_CONFIG_FILE.
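A minimal sketch of the idea; in the real deployment the variable would more likely be set with ENV in the Dockerfile or --env on docker run, and the /app/aws_config path and region below are placeholders:

# Sketch only: path and region are placeholders.
# The bundled file is a standard AWS CLI config, e.g.:
#   [default]
#   region = us-east-1
import os
os.environ.setdefault("AWS_CONFIG_FILE", "/app/aws_config")  # set before boto3 builds a session

import boto3
print(boto3.session.Session().region_name)  # should now resolve the default region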
cloudwatch example logs
Taxonomy, tagging, and topics oh my
This weekend I decided that I wanted an easier way to look up things I had written by the topics I’m writing about. I’m a fan of normalized data, so I started by creating a couple of tables: one for the tags, aka topics, that I’ve written about, and one for a link between my posts and the topics. I was day-dreaming the user interface for these topics and had started muddling through creating some endpoints when it occurred to me that I already had topics applied to the posts in the form of a string of topics used for SEO (search engine optimization). With this end in mind, I decided to write a script whose job would be to loop through all of my blog posts and process the tags/topics one by one. This script would create the tag/topic if it didn’t exist and then create the link between the post and the tag. After debugging that process, it occurred to me that a similar process could be used so that the user interface for maintaining the tags could be the existing blog post form, as is, without any changes. This could be accomplished by taking the existing SEO data, splitting it by commas, and then processing the topics/tags one at a time. All that was left was code to clean up anything that was removed when editing posts.
keywords
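A trimmed-down sketch of the idea (table and column names here are illustrative, not the real schema, and it assumes a UNIQUE constraint on tag.name):

# Sketch only: schema names are invented for illustration
import sqlite3

def sync_tags(db_path="data/vacuumflask.db"):
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    for post_id, seo in cur.execute("SELECT id, seo_keywords FROM post").fetchall():
        topics = [t.strip().lower() for t in (seo or "").split(",") if t.strip()]
        for topic in topics:
            # Create the tag if it doesn't already exist (needs UNIQUE on tag.name)
            cur.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (topic,))
            tag_id = cur.execute("SELECT id FROM tag WHERE name = ?", (topic,)).fetchone()[0]
            # Link the post to the tag
            cur.execute("INSERT OR IGNORE INTO post_tag (post_id, tag_id) VALUES (?, ?)",
                        (post_id, tag_id))
        # Clean up links for topics that were removed from the SEO string
        if topics:
            placeholders = ",".join("?" * len(topics))
            cur.execute("DELETE FROM post_tag WHERE post_id = ? AND tag_id NOT IN "
                        "(SELECT id FROM tag WHERE name IN (" + placeholders + "))",
                        [post_id] + topics)
        else:
            cur.execute("DELETE FROM post_tag WHERE post_id = ?", (post_id,))
    conn.commit()
    conn.close()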
multi-server deployments, health checks, edit links oh my
This weekend, we took vacuum flask multi-server.
  • Redis for shared server session state (rough sketch after this list)
  • Added health checks to the apache load balancer
  • Added EFS, docker, and the required networking to the mail server.
  • Fixed issue with deployment pipeline not refreshing image
  • Limited main page to 8 posts and added links at the bottom to everything else so it will still get indexed.
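For anyone curious, Flask-Session is one way to get Redis-backed sessions in flask; the wiring boils down to a few config lines (the host, port, and secret key below are placeholders, and this isn’t necessarily the exact setup on my boxes):

# Sketch of Redis-backed sessions via the Flask-Session extension; values are placeholders
import redis
from flask import Flask
from flask_session import Session

app = Flask(__name__)
app.config["SECRET_KEY"] = "change-me"    # placeholder
app.config["SESSION_TYPE"] = "redis"      # store sessions in Redis instead of signed cookies
app.config["SESSION_REDIS"] = redis.Redis(host="redis.internal", port=6379)
Session(app)  # after this, flask.session reads/writes through Redis on every node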
Raphael gets threading for a flask web server
The goal of today’s development session was to create a web source for Raphael bot’s closed captions. The existing solution created/updated labels in OBS Studio. The problem with using labels was that the formatting options were limited, and if a sentence ran long, the label would run off the screen. After some trial and error with async/await and flask, I ended up trying out the python3 threading module. Threading was exactly what I needed, although with flask running in a thread you can’t debug it on the main Raphael process, but that is to be expected.

def listen_local(self):
    # module-level imports assumed: asyncio, threading
    loop = asyncio.get_event_loop()
    print("Firing off Flask")

    # Run the flask web server in a background thread so it doesn't block transcription
    flask_thread = threading.Thread(
        target=lambda: self.myWebServ.run(
            host=self.config_data["flask_host_ip"],
            port=self.config_data["flask_host_port"],
            use_reloader=False, debug=True))
    flask_thread.start()
    self.myWebServ_up = True

    print("Listening to local Mic")
    # Transcription stays on the main thread's asyncio event loop
    loop.run_until_complete(self.basic_transcribe())

    loop.close()
new blog engine now live

Welp, the “new” blog is live. It’s running on Python 3.12 with the latest builds of flask and waitress. This version of my blog is containerized and actually uses a database. A very basic concept, I know. I considered running the blog on the Elastic Container Service and also as an Elastic Beanstalk app. The problem with both of those is that I don’t really need the extra capacity, and I already have reserved instances purchased for use with my existing EC2 instances. I’m not sure how well flask works with multiple nodes; I may have to play around with that for resiliency’s sake. For now we are using apache2 as a reverse HTTPS proxy with everything hosted on my project box.

Todo items: SEO for posts, RSS for syndication and a site map, and fixing S3 access when running from a docker container. Everything else should be working. There is also a sorting issue with the blog posts that I need to work out.

One Time secret clone and dockerization
This weekend started by working on a “one time secret” clone for personal use. ChatGPT got me 80% of the way there, and I spent the rest of the day tweaking and improving upon the code it wrote. Sunday I set about getting v2 of my blog production ready by introducing waitress, one of the WSGI servers recommended for hosting python flask web applications, and then containerizing the application so that it’s easy to stay on the latest fully supported version of python, which updates frequently. Cool scripts from this past weekend:
Dockerfile:

FROM public.ecr.aws/docker/library/python:3.12

WORKDIR /tmp
# Add application code
ADD app.py /tmp/app.py
ADD objects.py /tmp/objects.py
ADD hash.py /tmp/hash.py

COPY templates /tmp/templates
COPY static /tmp/static


COPY requirements.txt requirements.txt

RUN pip3 install -r requirements.txt

EXPOSE 8080

# Run it
CMD [ "waitress-serve", "app:app" ]
build image and run with shared folder:

#!/bin/bash

# Build the image and grab its id from the local image list
docker build --tag vacuumflask .
imageid=$(docker image ls | grep -w "vacuumflask" | awk '{print $3}')
# Run it with the .env file and the data folder mounted into the container
docker run  --env-file "/home/colin/python/blog2/vacuumflask/.env" \
--volume /home/colin/python/blog2/vacuumflask/data:/tmp/data \
-p 8080:8080 \
 "$imageid"
Build beanstalk zip:

#!/bin/bash
# Stage the app files, zip them up with a dated name, and ship everything to the project box
destination="beanstalk/"
zipPrefix="vacuumflask-"
zipPostfix=$(date '+%Y%m%d')
zipFileName="$zipPrefix$zipPostfix.zip"
mkdir "$destination"
cp -a templates/. "$destination/templates"
cp -a static/. "$destination/static"
cp app.py "$destination"
cp Dockerfile "$destination"
cp hash.py "$destination"
cp objects.py "$destination"
cp requirements.txt "$destination"
cd "$destination"
zip -r "../$zipFileName" "."
cd ../
# remove the staging folder now that the zip is built
rm -r "$destination"
scp "$zipFileName" project:blog2
scp docker-build-run.sh project:blog2
ssh project
Next weekend I’ll need to figure out how to get it working with Elastic Beanstalk and then work on feature parity.
Syncing data from the old blog to the new blog

This morning I automated my data sync between the old blog and the data storage system for the new one. This will let me keep an eye on how my newer posts look on the pages I’m building as I slowly replace the existing functionality.


#!/bin/bash
# copy the files from my project box to a local data folder
scp -r project:/var/www/blog/blogdata/ /home/colin/python/blog2/vacuumflask/data/
# read the blog.yml file and export the ids, then remove extra --- values from stream
# and store the ids in a file called blog_ids.txt
yq eval '.id' data/blogdata/blog.yml | sed '/^---$/d' > data/blogdata/blog_ids.txt
# loop through the blog ids and query the sqlite3 database and check and see if they exist
# if they do not exist run the old_blog_loader python script to insert the missing record.
while IFS= read -r id
do
  result=$(sqlite3 data/vacuumflask.db "select id from post where old_id='$id';")
  if [ -z "$result" ]; then
    python3 old_blog_loader.py data/blogdata/blog.yml data/vacuumflask.db "$id"
  fi
done < data/blogdata/blog_ids.txt
# clean up blog ids file as it is no longer needed
rm data/blogdata/blog_ids.txt
echo "Done"

flask templates and orm updates

After getting chores done for the day I set about working on my new blog engine. This started out with getting flask templates working, and after some back and forth that was sorted out. It then set in that I was repeating myself a lot because I had skipped an ORM model, so I set about writing a blog class to handle loading, serialization, updates, and inserts. A round of testing later and a bunch of bugs were squashed. A side quest today was to update all of the image paths from prior blog posts to use my CDN. I ended up using a combination of awk commands [ awk '{print $7}' images.txt > just_images.txt and awk -F '/' '{print $3}' image_names.txt > images2.txt ] to get a good list of images to push to the CDN, and then asked ChatGPT to help me write a bash loop [ while IFS= read -r file; do aws s3 cp "$file" s3://cmh.sh; done < images2.txt ] to copy all the images. I’ve seen one of my more Linux savvy coworkers write these loops on the fly and it is always impressive. I streamed the first couple hours of development and encountered a number of bugs with Raphael bot that I’ll see about fixing tomorrow.
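A sketch of the kind of class I mean, with field and table names invented for illustration rather than copied from the real schema:

# Sketch only: the real blog class and schema differ
import sqlite3

class BlogPost:
    def __init__(self, post_id=None, title="", body="", created=None):
        self.post_id = post_id
        self.title = title
        self.body = body
        self.created = created

    @classmethod
    def load(cls, conn, post_id):
        # Loading: one row in, one object out
        row = conn.execute(
            "SELECT id, title, body, created FROM post WHERE id = ?", (post_id,)
        ).fetchone()
        return cls(*row) if row else None

    def save(self, conn):
        # Insert when there is no id yet, otherwise update in place
        if self.post_id is None:
            cur = conn.execute(
                "INSERT INTO post (title, body, created) VALUES (?, ?, ?)",
                (self.title, self.body, self.created),
            )
            self.post_id = cur.lastrowid
        else:
            conn.execute(
                "UPDATE post SET title = ?, body = ?, created = ? WHERE id = ?",
                (self.title, self.body, self.created, self.post_id),
            )
        conn.commit()

    def to_dict(self):
        # Serialization for templates / JSON endpoints
        return {"id": self.post_id, "title": self.title,
                "body": self.body, "created": self.created}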