<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>astro&#39;s log</title>
    <link>https://paper.wf/astrob/</link>
    <description>A tiny plain-text log about my adventures on the Internet</description>
    <pubDate>Sun, 10 May 2026 12:39:49 +0000</pubDate>
    <item>
      <title>WhatsApp Chat Analysis - 21st May, 2024 </title>
      <link>https://paper.wf/astrob/21st-may-24-whatsapp-chat-analysis</link>
      <description>&lt;![CDATA[(Updated on 29/08/24. Added more context and the code!)&#xA;&#xA;I LOVE analyzing data. Deriving meaningful information from data and visualizing it is something I absolutely enjoy doing. &#xA;&#xA;So one evening, I decided it would be fun to export a group chat, analyze it, and send the results back to the group. It was a hit, yay! Everyone loved it, and their feedback and suggestions helped me incorporate more features.&#xA;&#xA;The analysis I did was VERY BASIC. A lot of the code is, umm, &lt;s&gt;stolen&lt;/s&gt; borrowed from dozens of tutorials and mixed with some good ole elbow grease. Also, it&#39;s worth noting that I CANNOT code; especially not in Python, which is what I used. So it was a fun experiment. It&#39;s not something I technically coded. It&#39;s more like something I tried to code. It is extremely hacky.&#xA;&#xA;Wrote some Python code that takes an exported WhatsApp group chat (whatsapp_chat.txt) and splits the text by the sender of each message. Used #regex for this. I know that Python regex and ECMAScript (which is the one I&#39;m a tiny bit familiar with) are different. &#xA;The knowledge, however, did not prevent me from making silly errors that took me forever to recognize :D &#xA; The whatsapp_organizesplit.py uses regex to extract messages from the main whatsapp_chat.txt and saves them into individual text files, named after each person, in a folder called &#34;output&#34;.&#xA;&#xA;Used regex yet again to remove the timestamps. Before doing so, I counted the timestamp matches to get the total number of messages, and saved this value. The regex generator helped me find an expression to match timestamps. &#xA;&#xA;Now, we have text files for every participant. These have no timestamps. I calculated total words for each sender. 
I used the following Python libraries to process the data further.&#xA; wordcloud &#xA; textblob &#xA; nltk&#xA; emoji&#xA;&#xA;Found out - &#xA; Most common words per person&#xA; Most common words for the entire group&#xA; Most common emojis per person&#xA; Most common emojis for the entire group&#xA; Sentiment / Polarity with simple sentiment analysis using textblob. I don&#39;t understand it at all, but it ticks all the buzzwords, which excited people in the group, so I&#39;m happy :D&#xA;&#xA;Saved all of these to text files.&#xA;&#xA;Code is on GitHub.]]&gt;</description>
      <content:encoded><![CDATA[<p>(Updated on 29/08/24. Added more context and the code!)</p>

<p>I LOVE analyzing data. Deriving meaningful information from data and visualizing it is something I absolutely enjoy doing.</p>

<p>So one evening, I decided it would be fun to export a group chat, analyze it, and send the results back to the group. It was a hit, yay! Everyone loved it, and their feedback and suggestions helped me incorporate more features.</p>



<p>The analysis I did was VERY BASIC. A lot of the code is, umm, <s>stolen</s> borrowed from dozens of tutorials and mixed with some good ole elbow grease. Also, it&#39;s worth noting that <em>I CANNOT code; especially not in Python</em>, which is what I used. So it was a fun experiment. It&#39;s not something I technically coded. It&#39;s more like something I <em>tried</em> to code. It is extremely hacky.</p>
<ol><li><p>Wrote some Python code that takes an exported WhatsApp group chat (<code>whatsapp_chat.txt</code>) and splits the text by the sender of each message. Used <a href="/astrob/tag:regex" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">regex</span></a> for this. I <em>know</em> that Python regex and ECMAScript (which is the one I&#39;m a tiny bit familiar with) are <em>different</em>.
The knowledge, however, did not prevent me from making silly errors that took me forever to recognize :D
The <code>whatsapp_organizesplit.py</code> uses regex to extract messages from the main <code>whatsapp_chat.txt</code> and saves them into individual text files, named after each person, in a folder called “output”.</p></li>

<li><p>Used regex yet again to remove the timestamps. Before doing so, I counted the timestamp matches to get the total number of messages, and saved this value. The <a href="https://regex-generator.olafneumann.org" rel="nofollow">regex generator</a> helped me find an expression to match timestamps.</p></li>

<li><p>Now, we have text files for every participant. These have no timestamps. I calculated total words for each sender. I used the following Python libraries to process the data further.</p>
<ul><li>wordcloud</li>
<li>textblob</li>
<li>nltk</li>
<li>emoji</li></ul></li>

<li><p>Found out -</p>
<ul><li>Most common words per person</li>
<li>Most common words for the entire group</li>
<li>Most common emojis per person</li>
<li>Most common emojis for the entire group</li>
<li>Sentiment / Polarity with simple sentiment analysis using textblob. I don&#39;t understand it at all, but it ticks all the buzzwords, which excited people in the group, so I&#39;m happy :D</li></ul></li>

<li><p>Saved all of these to text files.</p></li></ol>
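<p>The first few steps above can be sketched roughly as follows. This is not the code from the repository: it is a minimal standard-library sketch that assumes each exported line looks like <code>21/05/24, 7:45 pm - Alice: Hello</code> (the real format depends on the phone's locale), and the names <code>split_by_sender</code>, <code>top_words</code> and <code>save_per_sender</code> are made up for illustration. It counts parsed messages rather than raw timestamp matches, so system notices such as "Alice added Carol" are skipped.</p>

```python
import re
from collections import Counter, defaultdict
from pathlib import Path

# Assumed export format (varies by locale): "21/05/24, 7:45 pm - Alice: Hello"
TIMESTAMP_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}, \d{1,2}:\d{2}(?:\s?[APap][Mm])? - ")
MESSAGE_RE = re.compile(r"^([^:]+): (.*)$")


def split_by_sender(lines):
    """Group message texts by sender, dropping the timestamps on the way."""
    messages = defaultdict(list)
    last_sender = None
    for line in lines:
        line = line.rstrip("\n")
        stamp = TIMESTAMP_RE.match(line)
        if stamp:
            body = MESSAGE_RE.match(line[stamp.end():])
            if body:
                last_sender = body.group(1)
                messages[last_sender].append(body.group(2))
            else:
                # Timestamped line without "Sender: text", e.g. a system notice.
                last_sender = None
        elif last_sender is not None:
            # No timestamp: treat the line as a continuation of the last message.
            messages[last_sender][-1] += "\n" + line
    return dict(messages)


def top_words(texts, n=3):
    """Return the n most common lowercase words across a list of messages."""
    words = re.findall(r"[a-z']+", " ".join(texts).lower())
    return Counter(words).most_common(n)


def save_per_sender(messages, folder="output"):
    """Write one text file per sender into the given folder."""
    out_dir = Path(folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    for sender, texts in messages.items():
        (out_dir / f"{sender}.txt").write_text("\n".join(texts), encoding="utf-8")
```

<p>Calling <code>split_by_sender(open("whatsapp_chat.txt", encoding="utf-8"))</code> would give a dictionary of sender to messages; the length of each list is that sender's message count, and <code>top_words</code> can be run per sender or on all messages combined. Sender names are used as file names unchanged, which is an assumption that may need sanitizing for real chats.</p>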

<p>Code is on <a href="https://github.com/corvusdeinanis/basic-whatsapp-chat-analysis" rel="nofollow">GitHub</a>.</p>

<hr>
]]></content:encoded>
      <guid>https://paper.wf/astrob/21st-may-24-whatsapp-chat-analysis</guid>
      <pubDate>Thu, 23 May 2024 19:53:21 +0000</pubDate>
    </item>
    <item>
      <title>RegEx Text Tool - March 26, 2024 - May 24th, 2024</title>
      <link>https://paper.wf/astrob/march-26-2024-may-24th-2024-regex-text-tool</link>
      <description>&lt;![CDATA[Started as a way to perform simple text functions - make all letters uppercase, make all letters lowercase, and search and replace. I then realized that I could also let users use #regex in the search and replace. &#xA;&#xA;Went through multiple iterations with the current version (1.0) being the first fully usable one imo. Finished it on May 24th, 2024. This is something I worked on during exams, in tiny breaks. &#xA;&#xA;This version can do the following: &#xA;find matches (words/phrases), highlight the matches, and indicate total matches&#xA;find matches (regex), highlight matches, count them&#xA;can replace / substitute using regex&#xA;accepts not only a regular expression but also flags&#xA;&#xA;(uses ECMAScript flavour of RegEx)&#xA;&#xA;UPDATE: Code is on GitHub.]]&gt;</description>
      <content:encoded><![CDATA[<p>Started as a way to perform simple text functions – make all letters uppercase, make all letters lowercase, and search and replace. I then realized that I could also let users use <a href="/astrob/tag:regex" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">regex</span></a> in the search and replace.</p>



<p>Went through multiple iterations with the current version (1.0) being the first fully usable one imo. Finished it on May 24th, 2024. This is something I worked on during exams, in tiny breaks.</p>

<p>This version can do the following:</p>
<ul><li>find matches (words/phrases), highlight the matches, and indicate total matches</li>
<li>find matches (regex), highlight matches, count them</li>
<li>can replace / substitute using regex</li>
<li>accepts not only a regular expression but also flags</li></ul>

<p>(uses ECMAScript flavour of RegEx)</p>
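<p>For illustration, here is a rough sketch of the same feature set (find, count, highlight, replace, flags). It is written in Python with the <code>re</code> module, not the tool's own ECMAScript code, so details differ: the flag letters are mapped by hand, the <code>g</code> flag is emulated through the replacement count, and backreferences in replacements are written <code>\1</code> rather than <code>$1</code>. All function names here are hypothetical.</p>

```python
import re

# ECMAScript-style flag letters mapped to Python's re flags (assumed mapping).
FLAG_MAP = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def compile_pattern(pattern, flags=""):
    """Compile a pattern, translating flag letters such as "gi" into re flags."""
    combined = 0
    for letter in flags:
        if letter in FLAG_MAP:
            combined |= FLAG_MAP[letter]
    return re.compile(pattern, combined)


def find_matches(pattern, text, flags=""):
    """Return the (start, end) span of every match and the total match count."""
    regex = compile_pattern(pattern, flags)
    spans = [m.span() for m in regex.finditer(text)]
    return spans, len(spans)


def highlight(pattern, text, flags="", left="[", right="]"):
    """Wrap every match in markers; a web page would wrap it in a tag instead."""
    regex = compile_pattern(pattern, flags)
    return regex.sub(lambda m: left + m.group(0) + right, text)


def replace(pattern, replacement, text, flags=""):
    """Replace all matches when "g" is among the flags, else only the first."""
    regex = compile_pattern(pattern, flags)
    count = 0 if "g" in flags else 1  # count=0 means "replace everything"
    return regex.sub(replacement, text, count=count)
```

<p>For example, <code>find_matches("cat", "Cat cat CAT", "i")</code> reports three matches, while <code>replace("cat", "dog", "cat cat")</code> without the <code>g</code> flag only touches the first one.</p>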

<p>UPDATE: Code is on <a href="https://github.com/corvusdeinanis/regex-match-and-replace-tool" rel="nofollow">GitHub</a>.</p>

<hr>
]]></content:encoded>
      <guid>https://paper.wf/astrob/march-26-2024-may-24th-2024-regex-text-tool</guid>
      <pubDate>Thu, 23 May 2024 19:47:11 +0000</pubDate>
    </item>
    <item>
      <title>toshi-pono&#39;s Poketch on my M5Core2! Feb 7, 2024</title>
      <link>https://paper.wf/astrob/toshi-ponos-poketch-on-my-m5core2</link>
      <description>&lt;![CDATA[Compiled toshi-pono&#39;s poketch and was able to push it to my M5Core2 successfully! &#xA;&#xA;It&#39;s beautiful. I love it so much. It has a digital clock, a pedometer, a coin toss, and a drawing slate thingy. JUST LIKE THE GAMES. &#xA;&#xA;Opened the .ino and hit compile in the Arduino IDE. That&#39;s it!&#xA;&#xA;I&#39;m not sure why but the last time I tried it, it didn&#39;t work. I think I made some changes which fixed it. I replaced some libraries I think? Not sure if that&#39;s what helped because I was just messing around. &#xA;&#xA;I just added &#xA;static LGFX lcd;&#xA;static LGFX_Sprite sprite(&amp;lcd);&#xA;to the main.h. I don&#39;t even know if that&#39;s what made it work, or if I was just doing something wrong the first couple of times. &#xA;&#xA;My money is on the latter. I&#39;m too lazy to find out right now.]]&gt;</description>
      <content:encoded><![CDATA[<p>Compiled toshi-pono&#39;s <a href="https://github.com/toshi-pono/poketch" rel="nofollow">poketch</a> and was able to push it to my M5Core2 successfully!</p>

<p>It&#39;s beautiful. I love it so much. It has a digital clock, a pedometer, a coin toss, and a drawing slate thingy. JUST LIKE THE GAMES.</p>



<p>Opened the .ino and hit compile in the Arduino IDE. That&#39;s it!</p>

<p>I&#39;m not sure why but the last time I tried it, it didn&#39;t work. I think I made some changes which fixed it. I replaced some libraries I think? Not sure if that&#39;s what helped because I was just messing around.</p>

<p>I just added</p>

<pre><code>static LGFX lcd;
static LGFX_Sprite sprite(&amp;lcd);
</code></pre>

<p>to the <code>main.h</code>. I don&#39;t even know if that&#39;s what made it work, or if I was just doing something wrong the first couple of times.</p>

<p>My money is on the latter. I&#39;m too lazy to find out right now.</p>

<hr>
]]></content:encoded>
      <guid>https://paper.wf/astrob/toshi-ponos-poketch-on-my-m5core2</guid>
      <pubDate>Tue, 06 Feb 2024 21:38:20 +0000</pubDate>
    </item>
    <item>
      <title>Zen Writing Website - January 31st 2024 </title>
      <link>https://paper.wf/astrob/january-31st-2024-zen-writing-website</link>
      <description>&lt;![CDATA[I cobbled together a website for writing without distractions. I know plenty of such sites exist already, but still wanted to do it. The textbox does not accept backspace! No deleting, no editing!&#xA;&#xA;Why?&#xA;One of the reasons I struggle to finish writing is that I&#39;m constantly editing what I write. I know that editing while drafting is detrimental to writing. Get some stuff out there first. Get a block of marble first, and then you can chisel away. &#xA;&#xA;From my site, &#xA;This tiny web thingamajig is clean, no clutter, and lets you type without backspacing. Write first, edit later! Made a typo? That&#39;s okay, just keep moving forward. You can always fix it later. Write in a flow. Get the text out of your system and refine it later.&#xA;&#xA;Uses&#xA;Can be used for journaling. I think this is a solid use case. Use it as a means of cathartic venting, void-shouting.&#xA;Can be used to write without editing. Like I said earlier, irrespective of the kind of writing, content is a prerequisite. You can edit your first paragraph a billion times and get a decent one, but now you&#39;re burnt out and there are pages left to write.&#xA;&#xA;Features&#xA;A word counter (that is slightly inaccurate but good enough)&#xA;The ability to save what you&#39;ve written as a text file&#xA;Clean, clutter-free, distraction-free writing&#xA;&#xA;Design&#xA;It is a minimal website that uses water.css, an open-source super-simple CSS framework! This makes the site look super clean and professional with zero effort from me. &#xA;&#xA;Code&#xA;It&#39;s open source, of course! I should host it somewhere but until then, you can take a look / download the code from this CodePen link.&#xA;&#xA;UPDATE: It&#39;s now on GitHub.]]&gt;</description>
      <content:encoded><![CDATA[<p>I cobbled together a website for writing without distractions. I know plenty of such sites exist already, but still wanted to do it. The textbox does not accept backspace! No deleting, no editing!
</p>

<h2 id="why" id="why">Why?</h2>

<p>One of the reasons I struggle to finish writing is that I&#39;m constantly editing what I write. I know that editing while drafting is detrimental to writing. Get some stuff out there first. Get a block of marble first, and then you can chisel away.</p>

<p>From my site,
<code>This tiny web thingamajig is clean, no clutter, and lets you type without backspacing. Write first, edit later! Made a typo? That&#39;s okay, just keep moving forward. You can always fix it later. Write in a flow. Get the text out of your system and refine it later.</code></p>

<h2 id="uses" id="uses">Uses</h2>

<ul><li>Can be used for journaling. I think this is a solid use case. Use it as a means of cathartic venting, void-shouting.</li>
<li>Can be used to write without editing. Like I said earlier, irrespective of the kind of writing, content is a prerequisite. You can edit your first paragraph a billion times and get a decent one, but now you&#39;re burnt out and there are pages left to write.</li></ul>

<h2 id="features" id="features">Features</h2>

<ul><li>A word counter (that is slightly inaccurate but good enough)</li>
<li>The ability to save what you&#39;ve written as a text file</li>
<li>Clean, clutter-free, distraction-free writing</li></ul>
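<p>The core idea (ignore backspace, keep a running word count) can be modelled in a few lines. This is only a Python sketch of the behaviour, not the site's actual browser code; the class name <code>ForwardOnlyBuffer</code> and its methods are invented for the example, and the word count is a naive whitespace split, which is one simple way such a counter can end up slightly off (a stray <code>--</code> counts as a word, for instance).</p>

```python
class ForwardOnlyBuffer:
    """A text buffer that only moves forward: deletions are silently ignored."""

    IGNORED_KEYS = {"\b", "\x7f"}  # backspace and delete characters

    def __init__(self):
        self._chars = []

    def press(self, key):
        """Append a typed character unless it is a deletion key."""
        if key in self.IGNORED_KEYS:
            return
        self._chars.append(key)

    @property
    def text(self):
        return "".join(self._chars)

    @property
    def word_count(self):
        # Naive count: every whitespace-separated chunk is one word.
        return len(self.text.split())
```

<p>Feeding it the keystrokes of <code>"helo\b world"</code> leaves the typo in place: the backspace is dropped and the buffer reads <code>helo world</code>, two words.</p>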

<h2 id="design" id="design">Design</h2>

<p>It is a minimal website that uses <a href="https://watercss.kognise.dev/" rel="nofollow">water.css</a>, an open-source super-simple CSS framework! This makes the site look super clean and professional with zero effort from me.</p>

<h2 id="code" id="code">Code</h2>

<p>It&#39;s open source, of course! I should host it somewhere but until then, you can take a look / download the code from <a href="https://codepen.io/theorigins/details/rNRJZJG" rel="nofollow">this CodePen link</a>.</p>

<p>UPDATE: It&#39;s now on <a href="https://github.com/corvusdeinanis/timer-zen-writing-app" rel="nofollow">GitHub</a>.</p>

<hr>
]]></content:encoded>
      <guid>https://paper.wf/astrob/january-31st-2024-zen-writing-website</guid>
      <pubDate>Tue, 06 Feb 2024 21:34:23 +0000</pubDate>
    </item>
    <item>
      <title>Adventures in OCR and PDFs - January 21, 2024</title>
      <link>https://paper.wf/astrob/decided-to-read-frederick-coplestons-history-of-philosophy</link>
      <description>&lt;![CDATA[Decided to read Frederick Copleston&#39;s History of Philosophy. It is harder than usual to obtain an ebook version of this text. There is a scanned version on the Internet Archive, but it has a couple of volumes missing.&#xA;&#xA;I managed to find a scan of the first volume but, as expected, it consists of images. The image format is JBIG2, which is common for scanners. Tried to OCR the PDF and extract its text using Tesseract. Tesseract doesn&#39;t support the JBIG2 format, so I installed ocrmypdf. Thanks to Chocolatey, this was super simple. &#xA;&#xA;After about 15ish minutes (my sense of time is wack so it could have been much more, or less), I had a PDF that I could highlight and index! I also got a text file with the extracted text. That&#39;s a win!]]&gt;</description>
      <content:encoded><![CDATA[<p>Decided to read Frederick Copleston&#39;s History of Philosophy. It is harder than usual to obtain an ebook version of this text. There is a scanned version on the Internet Archive, but it has a couple of volumes missing.</p>



<p>I managed to find a scan of the first volume but, as expected, it consists of images. The image format is JBIG2, which is common for scanners. Tried to OCR the PDF and extract its text using <a href="https://github.com/tesseract-ocr/tesseract" rel="nofollow">Tesseract</a>. Tesseract doesn&#39;t support the JBIG2 format, so I installed <a href="https://github.com/ocrmypdf/OCRmyPDF" rel="nofollow">ocrmypdf</a>. Thanks to Chocolatey, this was super simple.</p>

<p>After about 15ish minutes (my sense of time is wack so it could have been much more, or less), I had a PDF that I could highlight and index! I also got a text file with the extracted text. That&#39;s a win!</p>
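<p>One way to get both outputs in a single run is ocrmypdf's <code>--sidecar</code> option, which writes the recognized text to a separate file next to the searchable PDF. The post does not say which options were actually used, so treat this as an assumption; the file names and the helper functions <code>build_ocr_command</code> and <code>run_ocr</code> are placeholders. The sketch just wraps the command line from Python and needs <code>ocrmypdf</code> on the PATH to run.</p>

```python
import subprocess


def build_ocr_command(input_pdf, output_pdf, sidecar_txt):
    """Build the ocrmypdf command line: searchable PDF plus a plain-text sidecar."""
    return ["ocrmypdf", "--sidecar", str(sidecar_txt), str(input_pdf), str(output_pdf)]


def run_ocr(input_pdf, output_pdf, sidecar_txt):
    """Run ocrmypdf and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ocr_command(input_pdf, output_pdf, sidecar_txt), check=True)
```

<p>A call such as <code>run_ocr("volume1_scan.pdf", "volume1_ocr.pdf", "volume1.txt")</code> (hypothetical file names) would then produce the highlightable PDF and the text file described above.</p>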

<hr>
]]></content:encoded>
      <guid>https://paper.wf/astrob/decided-to-read-frederick-coplestons-history-of-philosophy</guid>
      <pubDate>Sat, 20 Jan 2024 21:54:46 +0000</pubDate>
    </item>
  </channel>
</rss>