Cred Lab x HCI Lab: Credbot | Emilie Zhang S Portf

Screen Shot 2023-12-24 at 10.19_edited.j

Credbot

Wellesley HCI Lab x Credibility Lab, Prof. Orit Shaer & Prof. Eni Mustafaraj

Developed a chatbot in the form of a chrome extension using AI LLMs to check website credibility; paper in progress

My Role

Project Manager; Software Developer and AI integration; UI Designer; UX researcher

Team

Emilie Zhang, Wellesley College

Rita Lyu, Wellesley College

Lan Dau, Wellesley College

Nina Howley, Wellesley College

Dianna Gonzalez, Wellesley College

Timeline

September - December 2023

Tools

Javascript

HTML & CSS

OpenAI API (GPT-4)

Figma

R for quantitative analysis

There are many reasons why a website can be untrustworthy, yet many of these criteria are not transparent. Furthermore, if the credibility checking tool relies on the user proactively entering a website they find suspicious, all websites users falsely deem trustworthy escape scrutiny.

Jointly developed under Prof. Orit Shaer's Human-Computer Interaction (HCI) Lab and Prof. Eni Mustafaraj's Credibility Lab, we created a conversational user interface Chrome extension using LLMs that automates the credibility checking process as a latent assistant, called Credbot. We also designed and conducted a user study to discover insights into the tool’s effectiveness.

My Role:

As the first software developer, I iteratively programmed the entire back and frontend of our first iteration in 3-4 months
Became project manager of the team, leading and coordinating
Created first iteration of UI & conducted extensive user studies
Jointly authored academic paper, currently in the publishing process

Background Research: Credibility Lab

Screen Shot 2023-12-24 at 11.04.26 AM.png

Figure 1: A screenshot of the Google search page results for
CNN. It contains (1) the Knowledge Panel, (2) a section on
top stories; and (3) a section of recent tweets. The panel "Topics they write about" contains stories from "2 weeks ago", indicating that topics are not “discovered” on the fly

Image and caption citation:

Investigating the Effects of Google's Search Engine Result Page in Evaluating the Credibility of Online News Sources, Emma Lurie & Eni Mustafaraj

After the 2016 election, Prof. Mustafaraj started a new research project to understand how users make decisions about what online sources to trust. She found surprisingly that "young users are not particularly skilled in assessing the credibility of online content," and that there were "widespread inconsistencies in the coverage and quality of information" in the information presented by Google (Lurie, Mustafaraj). The project proposal, "Signals for evaluating the credibility of web sources and advancing web literacy", received an NSF CAREER grant (2018-2023).

Prof. Mustafaraj's Credibility Lab sought to discover whether and how humans can reclaim their agency when deciding what information sources to trust. The goals were to:

identify a set of human-understandable signals that are deemed helpful for evaluating the credibility of web sources and validate them through users studies;
propose and implement algorithmic techniques for computing some of these signals, providing a trail of transparency about how they work, and
develop a novel web platform for the interactive exploration of signals, modeled upon nutrition fact labels, that will contribute in advancing web literacy skills in the broad public.

HCI Lab: Credbot

Using the credibility signals developed in goal 1), we sought to conduct artifact research and advance goal 2) under the guidance of Prof. Orit Shaer and her Human-Computer-Interaction (HCI) Lab in the course CS366, Advanced Projects in Interactive Media. In our group of three, we successfully created and implemented a working model of Credbot, explored and iterated over designs, considered ethical implications, conducted evaluative studies, and reported our findings in our paper (in progress).

Screen Shot 2023-12-07 at 12.22.29 PM.png

Figure 2: Example of Credbot assessing the credibility of a webpage. The yellow banner indicates medium credibility, The user types and asks Credbot to "summarize this article" (in green). The following paragraph of text in gray is Credbot's answer.

Problems:

With factors such as biased perspectives or deeply embedded ads that undermine the reliability and trustworthiness of its content, it is difficult to discern website credibility. The average web user often expects a streamlined search query process, but existing methods of verifying a website’s credibility are cumbersome and only employed at the user’s discretion [8]. Because many users don’t have a firm understanding on what makes a website credible, if the credibility checking tool relies on the user proactively entering a website they find suspicious, all websites users falsely deem trustworthy escape scrutiny.

Our Solution:

To solve these issues, we propose a conversational user interface Chrome extension that automates and customizes the credibility-checking process as a latent assistant.

CredBot integrates the OpenAI API into a chatbot chrome extension to:

primarily detect when users are on websites with low credibility, and provide credibility warnings to deter the consumption of misinformation
detect when users are on websites that attempt to give medical information (or later, other topics prone to misinformation), and provide context warnings
converse with the user and provide the reasoning behind warnings or the credibility measures chosen based on Wellesley’s CredLab's research, and increase critical thinking & web literacy skills for users to think for themselves instead of relying solely on our extension

My Contributions:

1. Software Development

Figure 3: Sample low fidelity sketch

Screen Shot 2023-12-25 at 10.19.40 AM.png

Figure 4: Sample interaction flow with Credbot

HTML, CSS, and JavaScript Implementation

Because of Javascript being the the standard for developing Chrome extensions, I iteratively developed the chatbot with Javascript as the backbone. It enables dynamic content manipulation and user interaction handling within the browser environment. My scripts chatbot4.js and contentScript.js primarily use JavaScript for scripting behavior, handling events, and interacting with web pages, and are responsible for essential functionalities in the extension such as responding to user actions, manipulating web page content, and communicating with external APIs like OpenAI's GPT-4. For the structure of the chatbot interface, I used HTML in conjunction with CSS following the format of our low-fidelity sketch.

API Integration with OpenAI's GPT-4 & Iterations
For Credbot to have advanced natural language processing and machine learning capabilities, I integrated the extension with OpenAI's GPT-4 API. This integration allows the chatbot to generate intelligent, context-aware responses and conduct credible website evaluations. Rapid iteration was also needed for the integration of GPT-4 to have the correct capabilities for Credbot, espcially with GPT-4's limitations.

Screen Shot 2023-12-21 at 1.15.46 PM.png

Iteration 1

Iteration 2

Iteration 3

Figure 5: my Iteration process for the software

The first step I had to do was ensure that OpenAI’s API was successfully connected to our chrome extension, and that it was generating context-aware responses through GPT-4’s framework. I was able achieve this step relatively quickly in the first iteration of Figure 3.

However, because the public version of OpenAI's API, including ChatGPT, does not currently have the capability to browse the internet, this version of the chatbot did not know the URL of the website it was on. Therefore, for the second iteration, I retrieved the URL of the website the user was currently on and passed it as a user prompt. Because OpenAI’s public API cannot perform live searches and, therefore, cannot access internet contents from a URL, the inner text of the web page also had to be passed to our chatbot in order for it to perform analyses on its contents. To do this, I accessed the innerText of each webpage in Javascript and passed it as another user prompt to GPT-4.

By the third iteration shown in Figure 5, Credbot was able to 1) automatically know what website the user was on, and 2) know the contents of the webpage to begin credibility analysis. These iterations demonstrate how I explored and adapted to GPT-4's capabilities in API integration. While many more iterations were needed to set up the core functionality of Credbot, those additions were more in line with traditional software & extension development.

Browser Extension Framework
To ensure my code could work as a chrome extension, I followed the Manifest V3 structure in my manifest.json file, which is tailored for Chrome browsers. This structure governs my extension's permissions, ensuring secure access to necessary resources like active tabs and external APIs. It also manages the injection of my scripts and resources into web pages, facilitating the seamless integration of the chatbot into the user's browsing experience without disrupting the functionality of the web pages.

2. System Prompting

Screen Shot 2023-12-25 at 11.30.55 AM.png

Figure 6: Example of our system prompt for one iteration

While I was solely responsible for software development, the system prompting of GPT-4 was a collaborative process amongst my teammates. Prompt engineering, in the context of this chatbot's integration with OpenAI's GPT-4 API, was a critical aspect of the technology overview. Five credibility signals were chosen via Prof. Mustafaraj's Credibility Lab, and were as follows: disclosure of authorship, disclosure of ownership, volume of ads, volume of promoted content, and organization type (non-profit, for-profit, governmental etc.). These were passed to GPT-4 as a system prompt, along with other instructions.

3. User Studies & Thematic Analysis

Screen Shot 2023-12-25 at 11.34.43 AM.png

Figure 7: Example of our Figma prototype

Figure 8: Thematic Analysis

We conducted an empirical pilot study of 20 participants ages 18-24 varying in gender, sex, and concentration of study, though skewing towards female and CS or CS-adjacent students. During a 45 minute study session with a researcher, users visited two sets of four websites on a Google search results page Figma prototype that led to normal Google websites with the Credbot extension on.

Based on observed interactions and surveys, we identified two major areas of want users have for CredBot: education and assistance (Figure 8). As an on-demand education tool, upon CredBot’s initial credibility rating, users' subsequent prompts frequently indicated a want for CredBot to define specific signal meanings and elaborate on the signal’s contextual application to a given webpage, and/or the signals’ relevance to website credibility. Given CredBot’s responses, participants would express higher trust in both the chatbot’s credibility rating, as well as confidence in their own initial impressions of a given domain if it aligned with CredBot’s. To read more about qualitative and quantitative analysis, click here.

Emilie Zhang's Portfolio