Academia.edu ・ 2024
AI-powered personalization
From hackathon proposal to 53% lift in engagement
1 Product manager
2 Engineers
1 Data analyst
1 Designer (me!)
Product strategy
Product design
A/B testing & experimentation
Cross-functional leadership
AI/LLM integration
Overview
I identified an opportunity the team was overlooking: we were sitting on a ton of user data but not using it. Through a hackathon proposal, rapid prototyping, and rigorous experimentation, I transformed unused behavioral data into an AI-powered feature that drove a 53% lift in engagement.
My hackathon proposal
Connecting three things
I wrote a hackathon proposal that connected:
Behavioral signals we already had from our users: paper downloads and reading history
LLM capabilities to analyze that history and suggest relevant research topics
Minimal UX friction by boiling it down to what users actually need
There were over 120 proposals submitted. Mine was one of 10 that got accepted.
The Problem
We had the data but weren't using it
We already collected papers users viewed, papers they downloaded, their search queries, and their reading patterns. But we weren't leveraging this data to improve the user experience. New and returning users faced the same generic interface, missing opportunities for:
Personalized onboarding that reflects research interests
Behavioral-based suggestions that feel relevant
Effortless input experiences that reduce friction
The challenge: How might we use existing data to create personalized experiences without adding cognitive load or requiring users to fill out blank text fields?
The Solution: AI-Powered Personalization
Instead of asking users to describe their research interests in a blank field (which is cognitively demanding), I used an LLM to analyze their recent downloads and suggest topics.
How it works
Users see: "Hi, Shelly. What are you researching?" followed by four suggested topic chips.
Key design decisions:
Simple question reduces cognitive load. It's conversational and approachable.
LLM suggests topics based on recent downloads, so it feels personalized and accurate.
Easy interactions: One tap to select a suggestion, or type your own custom input.
Immediate feedback with a curated personal feed that reflects what you're researching.
No blank text box anxiety. No wondering if you're describing your work correctly.
The AI layer (invisible to users)
We used a general-purpose LLM via API. We didn't fine-tune a custom model. The model took lightweight behavioral signals like downloaded paper titles and abstracts as input, and generated a few concise topic suggestions.
I worked with our data scientist to co-design the prompt. We tested temperature settings and prompt phrasing to balance accuracy, diversity, and cost (since every token adds up). For example, longer prompts improved precision but doubled API cost, so we truncated paper metadata and capped suggestions at four topics per user.
From a user perspective, the AI layer was invisible. The output looked like a friendly, human-curated suggestion.
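To make that layer concrete, here's a minimal sketch of the suggestion flow, assuming a chat-completion API (the OpenAI Python SDK here). The model name, truncation limits, and function name are illustrative placeholders rather than our production setup.

```python
from openai import OpenAI  # assumption: any chat-completion API would work similarly

client = OpenAI()

def suggest_topics(recent_downloads, max_topics=4):
    """Turn a user's recent downloads into a few short topic suggestions.

    `recent_downloads` is a list of dicts with "title" and "abstract" keys.
    Metadata is truncated aggressively to keep token costs down.
    """
    paper_lines = [
        f"- {p['title']}: {p.get('abstract', '')[:200]}"  # truncate abstracts
        for p in recent_downloads[:10]                     # only the most recent papers
    ]
    prompt = (
        "Here are papers a researcher recently downloaded:\n"
        + "\n".join(paper_lines)
        + f"\n\nSuggest {max_topics} research topics they may be exploring. "
        "Keep each topic short and plain, e.g. 'Philosophy of art'. One per line."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0.4,      # lower temperature kept suggestions on-topic
    )
    raw_lines = response.choices[0].message.content.splitlines()
    topics = [line.strip("-• ").strip() for line in raw_lines if line.strip()]
    return topics[:max_topics]  # cap at four topics per user
```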
Three Experiments
Once we had the prototype, I designed three experiments to test what actually works.
Experiment 1: Text only
No AI, just a blank text box.
Result: 5.64% click-through rate and a 30% lift in feed engagement.
Experiment 2: LLM with long suggestions
The AI analyzed user behavior and suggested things like "Philosophy of aesthetics and art criticism" and "Aesthetic theory and the nature of art." Very academic, very accurate.
Result: 4.32% CTR. It underperformed.
Why: The long academic phrasing created friction. Users had to read and parse complex phrases.
Experiment 3: LLM with short, refined suggestions
Same AI technology, but I refined the UX writing to be concise: "Philosophy of art," "Aesthetic theory," "Art criticism."
Result: 7.23% CTR and a 53% lift in feed engagement.
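As a rough illustration of how a CTR difference like this can be sanity-checked, here's a two-proportion test using the rates reported above. The per-variant sample sizes are hypothetical placeholders (actual traffic numbers aren't part of this write-up), and the lift computed here is the relative CTR lift between Experiments 1 and 3; the 53% figure refers to feed engagement, which was tracked separately.

```python
from statsmodels.stats.proportion import proportions_ztest

# Reported click-through rates; sample sizes below are hypothetical placeholders.
ctr_text_only = 0.0564   # Experiment 1
ctr_short_llm = 0.0723   # Experiment 3
n_per_variant = 20_000   # assumed users per variant

clicks = [round(ctr_short_llm * n_per_variant), round(ctr_text_only * n_per_variant)]
users = [n_per_variant, n_per_variant]

z_stat, p_value = proportions_ztest(count=clicks, nobs=users)
relative_ctr_lift = (ctr_short_llm - ctr_text_only) / ctr_text_only

print(f"Relative CTR lift: {relative_ctr_lift:.0%}")  # ~28%
print(f"p-value: {p_value:.4g}")
```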
Key insight
The same LLM technology performed better with refined UX writing. It wasn't about the AI being smarter. It was about how I presented the AI's output to users.
The AI was the same, but when the phrasing was long and academic, people ignored it. When we rewrote it in shorter, clearer language, we saw better results.
It showed me that designing for AI isn't just about the prompt behind the scenes. It's about how the product talks to people.
The Winning Experiment
Experiment 3: LLM with short, refined suggestions
When a user lands on the homepage, they see "Hi, Shelly. What are you researching?" with four concise topic chips below. One tap and they're in. Or they can type their own.
They immediately see a personalized feed with:
Papers and discussions related to those topics
Posts from people in their field
Academics working on similar topics
Related papers in their reading list
The entire experience takes 5 seconds, but it fundamentally changes their relationship with the platform. Instead of a generic feed, they're seeing content curated to their actual research interests.
Process: Six Weeks from Idea to Impact
Result: 53% lift in feed engagement. This became the foundation for the team's OKRs for the next quarter. What started as a side project became a company priority.
Cross-Functional Leadership
To bring this from prototype to production, I had to lead cross-functionally across three teams.
With Data Science
I co-designed the LLM prompts. This wasn't just "here's the design, make it work." We iterated together to balance accuracy versus cost (running LLMs isn't free), and we defined fallback logic for edge cases.
With Product
I aligned on success metrics before we built anything. What does success look like? We designed the experiment structure together and I built the business case for why this was worth the investment.
With Engineering
I scoped technical feasibility. What can we actually build in the timeline? And I set up tracking so we could learn from user behavior and improve future suggestions. This wasn't just analytics. It was a learning system.
Measuring AI quality
We also set up simple ways to measure whether the AI was actually doing a good job. Instead of only tracking clicks or engagement, we looked at how often people chose one of the AI's suggestions versus ignoring or changing them. That helped us understand whether the suggestions felt accurate and trustworthy from a user's point of view.
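As a sketch of what that looks like in practice, the metric can be as simple as the share of users who accepted a suggestion versus typed their own topic or ignored the prompt. The event names and schema below are hypothetical, meant only to illustrate the shape of the measurement.

```python
from collections import Counter

def suggestion_quality(events):
    """Share of users who accepted, replaced, or ignored the AI suggestions.

    `events` holds one record per user who saw the suggestion chips; the
    "action" values here are hypothetical event names, not a real schema.
    """
    counts = Counter(e["action"] for e in events)
    total = sum(counts.values()) or 1
    return {action: count / total for action, count in counts.items()}

# Example usage with a tiny, made-up log:
events = [
    {"user_id": "u1", "action": "accepted_suggestion"},
    {"user_id": "u2", "action": "typed_custom_topic"},
    {"user_id": "u3", "action": "ignored"},
    {"user_id": "u4", "action": "accepted_suggestion"},
]
print(suggestion_quality(events))
# {'accepted_suggestion': 0.5, 'typed_custom_topic': 0.25, 'ignored': 0.25}
```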
Later, we ran a small test that added a short label like "Suggested based on your recent downloads." It didn't really increase clicks, but users said it made them feel more confident about why they were seeing those topics. So we kept that as a pattern for future AI features.
Future Iterations
There's still room to improve. Potential future iterations include:
Progressive disclosure
Show suggestions only after the user starts typing, rather than immediately. This might feel less pushy.
Enhanced transparency
Test "Based on [specific paper title]" versus generic labels. Does showing the source increase trust?
Multi-interest profiling
Allow users to add multiple research areas, since most academics work across domains.
Qualitative research
I want to understand why users select, edit, or reject suggestions. The quantitative data tells us what happened, but not why.
Cost optimization
Refine LLM prompts to reduce API calls while maintaining accuracy. Every call costs money.
These aren't just nice-to-haves. They're about making the feature sustainable long-term.
Impact & Outcomes
Quantitative results
53% lift in feed engagement
Users interacted more with personalized content.
7.23% CTR on suggestions
High adoption of AI-generated topics.
6 weeks from idea to production
Rapid validation and iteration.
Strategic impact
The impact went beyond just the metrics:
Shifted team thinking from "ask users" to "infer from behavior." We proved that behavioral inference works better than self-reporting.
Foundation for next quarter's OKRs. The team built their roadmap around this.
Template for future AI features across the product. Now when the team pitches AI ideas, they reference this project.
Established design patterns for LLM-powered suggestions that other designers could use.
Validated the ROI of designer-led initiatives. I proposed this, prototyped it, designed the experiments, and drove it to shipping. It showed that designers can lead product strategy, not just execute on requirements.
What I learned
Designing for AI isn't just about the prompt behind the scenes
It's about how the product talks to people. The same AI technology performed dramatically differently based on how I presented it to users. When the phrasing was long and academic, people ignored it. When we rewrote it in shorter, clearer language, feed engagement lifted 53%. AI product design isn't about having the most advanced model. It's about making AI feel natural, trustworthy, and effortless for users.
Behavioral inference works better than asking users
We proved that inferring from behavior (downloads, reading patterns) works better than self-reporting for personalization. This shifted the team's thinking from "ask users" to "infer from behavior" and created a template for future AI features across the product.






