Note: There are likely to be a lot of typos and poor grammar; this post is yet to be proofread.
Introduction
Algorithmic bias is a preference for certain results that can be built into an algorithm. In some applications, algorithms have tendencies built into them simply because of the nature of computers. More important, however, is when biases are intentionally added to benefit a specific group, usually at the expense of others. I wanted to look into Neocities to see whether it has a bias, so my question became:
What are the qualities of the websites that are shown under the 'Special Sauce' filter on Neocities? Do they represent the population, and what are the requirements?
To answer this question I decided to conduct my own research by gathering the public stats from the website. I wanted to create a poster as a fast way to show my results. However, the poster did not come out as nicely as I would have liked compared to the previous one. I will be putting most of the text and images elsewhere so that they are more readable and nicer to look at.
Below is the poster, and then I will include the conclusion and graphs afterward. Also, sorry for the typos, I'm in a bit of a hurry.
Digital Poster
Note: Yay, because I couldn't export the PDF properly I had to capture the images with Print Screen, so the resolution will be bad =(. I've appended the results below.
Poster Conclusion
Algorithmic bias can affect who gets priority and often works to the advantage of a specific group. I completed a small analysis of the 'Special Sauce' page of Neocities to see if there was a preference for a certain kind of website and whether it represented the population.
I found that the websites displayed are very diverse in terms of stats and tags. Views ranged from 32,452 to 129,568,850, with an average of 2,767,216.39 views and a standard deviation of 14,184,758. Our 95% confidence interval was (-311,064.80, 5,845,497.59). You can't have negative views, of course; the negative lower bound shows how widely spread the data is, and it means the number of views you have is not a very good predictor of whether you'll be shown in 'Special Sauce'.

Note: Tabulated results for the views, followers, updates, last updated, and creation date. The non-date values in the table are in days; for example, the margin of error is 168.87 days.
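For anyone who wants to check the interval for views, here is a minimal sketch of the usual mean ± t * SD / sqrt(n) calculation, using only the summary stats quoted in this post. The critical value 1.989 (two-sided 95%, 83 degrees of freedom) is an assumption taken from a standard t-table, and the function name is just for illustration, so the result matches the reported interval only up to rounding.

```python
import math


def confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Summary statistics for views, as reported in the post.
mean_views = 2_767_216.39
sd_views = 14_184_758
n_sites = 84

# Assumption: two-sided 95% critical value of Student's t with
# n - 1 = 83 degrees of freedom, read from a t-table.
T_CRIT_83 = 1.989

lower, upper = confidence_interval(mean_views, sd_views, n_sites, T_CRIT_83)
print(f"95% CI for mean views: ({lower:,.2f}, {upper:,.2f})")
```

The lower bound comes out negative because the standard deviation is about five times larger than the mean, which is another sign that the views are heavily skewed by a few very large sites.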
From the paired t-test, the p-value was not below the significance level (p = 0.087855, alpha = 0.05), so we fail to reject the null hypothesis. I cannot say that the 'Special Sauce' websites come from a different population. I do want to point out that the personal tag was heavily overrepresented in our sample; however, I cannot say that this is not purely random. Overall, I can confidently say that Neocities does not appear to be promoting any one sort of website more than others in terms of tags. The websites that are shared in 'Special Sauce' tend to have over 10,000 views, but the actual range is very large and not very normal. What is conclusive is that all the websites shared had been updated within the last 2 days at most.
Data Analysis
I collected data from the website profiles (n = 84) and used a confidence level of 95% (alpha = 0.05). From each website I collected the number of views, followers, and updates, the date created, and the time since the last update in days. From there I created a duplicate dataset with outliers removed using the Thompson tau test, and ended up removing around 3/4 of the data (leaving n = 23). To determine whether certain tags were overrepresented, I did a paired t-test (null hypothesis: mu = 0). I compared my sample data with an estimate of the population based on the number of pages for each tag, adjusted to be normalized.
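To give an idea of what the outlier removal looks like, here is a rough sketch of the modified Thompson tau test in plain Python. It is not the exact procedure or tool used for this post. The function names and the example numbers are made up, and the t critical value comes from a series approximation (an assumption made to avoid extra libraries) that is least accurate for very small samples.

```python
import math
import statistics


def t_critical(df, alpha=0.05):
    """Approximate two-sided Student's t critical value.

    Assumption: uses a Cornish-Fisher style expansion around the normal
    quantile; a statistics library would give more exact values.
    """
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    return z + g1 / df + g2 / df**2


def thompson_tau_filter(values, alpha=0.05):
    """Drop the most extreme value, one at a time, while it fails the test."""
    kept = list(values)
    removed = []
    while len(kept) >= 3:
        n = len(kept)
        mean = statistics.mean(kept)
        sd = statistics.stdev(kept)  # sample standard deviation
        if sd == 0:
            break
        # The suspect is the point farthest from the mean.
        suspect = max(kept, key=lambda x: abs(x - mean))
        t = t_critical(n - 2, alpha)
        tau = t * (n - 1) / (math.sqrt(n) * math.sqrt(n - 2 + t * t))
        if abs(suspect - mean) > tau * sd:
            kept.remove(suspect)
            removed.append(suspect)
        else:
            break
    return kept, removed


# Made-up example values with one obvious outlier.
example = [10, 12, 11, 13, 12, 11, 10, 12, 500]
kept, removed = thompson_tau_filter(example)
print("kept:", kept)
print("removed:", removed)
```

Because the mean and standard deviation are recalculated after every removal, heavily skewed data like view counts can lose a large share of its points this way.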


Note: Setting up, check back later.


Note: Setting up, check back later.
To determine tag bias, I first removed all labels that appeared fewer than 4 times (new n = 18). Then I adjusted them with (X = X / MAX(Sample)). Each page has 100 websites, so by filtering by tag I could estimate the number of sites for each tag. I then took the difference between the sample and the population for the paired t-test.
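The steps above can be sketched in a few lines of Python. This is only an illustration: the tag counts and page counts below are invented, not my real data, and dividing each series by its own maximum is my reading of the X = X / MAX adjustment. `MIN_COUNT` and `SITES_PER_PAGE` are the 4 and 100 mentioned above.

```python
import math
import statistics

MIN_COUNT = 4         # tags seen fewer times than this are dropped
SITES_PER_PAGE = 100  # each tag listing page shows 100 websites


def normalize_by_max(counts):
    """Scale counts so the largest value becomes 1.0."""
    peak = max(counts.values())
    return {tag: count / peak for tag, count in counts.items()}


def paired_t_statistic(sample, population):
    """Return (t, degrees of freedom) for the paired differences."""
    diffs = [sample[tag] - population[tag] for tag in sorted(sample)]
    n = len(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_err, n - 1


# Hypothetical numbers, for illustration only.
sample_counts = {"personal": 30, "art": 12, "videogames": 6, "blog": 5, "music": 3}
pages_per_tag = {"personal": 120, "art": 90, "videogames": 40, "blog": 35, "music": 50}

# Step 1: drop rare tags. Step 2: estimate the population from page counts.
kept = {tag: c for tag, c in sample_counts.items() if c >= MIN_COUNT}
population_counts = {tag: pages_per_tag[tag] * SITES_PER_PAGE for tag in kept}

# Step 3: normalize both, then test the differences against mu = 0.
t_stat, df = paired_t_statistic(
    normalize_by_max(kept), normalize_by_max(population_counts)
)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The t value would then be compared against a t-table (or turned into a p-value with a statistics package) at alpha = 0.05 to decide whether to reject the null hypothesis.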


Note: Setting up, check back later.
From the data collected, we can see that the raw data was not very normally distributed, with the exception of the date created. This means that there is a wide distribution of websites that have a lot of views, followers, and updates. However, it is important to note that I only looked at 84 websites. Neocities hosts over 1 million sites, so the sample represents less than 0.01%.
Afterword / Further Reading
As mentioned in the poster / analysis, I used a dataset of n = 84, which is not bad for a sample; however, for such a large population it is very possible that my data is skewed. For better results I would try to use a script to automatically collect the data. Another thing I forgot to mention is that all the data was collected over the course of two different days that were not consecutive. This could also have had an effect on the data.
Neocities is an open-source project, which is a good sign that it is unlikely to have a bias. However, even though my results did not find a bias, my p-value was pretty low, around 0.08. Maybe in the future it would be better to compare this to another provider that is more likely to be biased. Perhaps something to do with Canva, or other services that advertise premium aspects of their website.
Sadly I don't have any further reading for this ATM; I'd like to come back to this post. Lastly, I want to say that this will be the last post from my DTC 475 class. I really did enjoy the course and might keep doing something similar out of my own interest in the future. I learned a lot about making posters and research. This website was a long time in the making, but it was born from the urgency of the DTC class. Thank you for reading. =)