Escalating Flagged Conversations Guide
Review and manage moderated and flagged messages in your Playlab apps
What is this feature?
Playlab now supports Escalating Flagged Conversations, which notifies builders, workspace owners, and organization admins when conversations with their apps contain moderated or flagged content. You can review these conversations to make sure your apps maintain high-quality interactions.
By default, automatically moderated messages and messages manually flagged as inappropriate are hidden in the conversation to protect users from potentially inappropriate content.
Moderation is still in beta and will occasionally make mistakes.
We are testing an improved user input moderation solution, so expect changes in the coming weeks.
Rationale for the feature
Playlab users have been asking for improved visibility into moderated or flagged conversations with their apps. With Escalating Flagged Conversations, administrators are promptly notified when moderation occurs, enhancing transparency and enabling quick action.
This feature makes it simple to identify and review potentially problematic conversations, allowing you to update your apps as needed and maintain a safe, high-quality user experience.
Users can also flag conversations that they consider inappropriate or biased. When a user flags a conversation, the app’s creator is notified by email.
How do notifications work?
User-Flagged Conversations
User-flagged conversations trigger immediate notifications to app creators, giving users a direct way to report conversations they find problematic.
Reviewing Flagged Conversations
The flagged conversation review interface gives administrators a streamlined workflow to assess, address, and resolve reported issues. User-flagged conversations and automatically moderated messages are marked with separate icons, and both appear on the activity page of your Playlab workspace.
Automatic Moderation Notifications
When a message is automatically moderated, the app creator, workspace owner, and organization admin each receive an email notification. This email is distinct from the one sent for manually flagged conversations.
Accessing Moderated Content
Navigate to Workspace Activity
Access your workspace activity page to view all conversations
Filter by Conversation Type
Use the new Conversation Type filter to quickly find flagged or moderated conversations
Review the flagged conversation
Click the conversation to open the full thread and review its messages
View moderated content
Use the “Show Message” toggle to reveal the moderated message content (only available to authorized users)
The toggle will display:
- The full content of the moderated message
- The category of violation (e.g., “Moderated for: violence”)
Remember that moderated content may contain inappropriate material. Review with caution.
Tips for Managing Flagged Conversations
Best Practices
- Respond promptly to notifications: Review flagged conversations as soon as possible to address any issues with your app.
- Analyze patterns in flagged content: Look for trends in the types of content being flagged to identify potential improvements for your app.
- Update your app’s instructions: If you notice recurring issues, consider updating your app’s instructions to prevent similar problems.
- Follow up with users when appropriate: For identified users experiencing issues, consider reaching out to provide support or clarification.
- Document your resolution steps: Keep track of how you resolve flagged conversations to improve your response in the future.
- Consider required guardrails: For apps dealing with sensitive topics, implement clear guardrails in your instructions to prevent moderation issues.
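For the pattern-analysis tip above, a small script can help once you have a handful of reviews behind you. The following is a minimal sketch, not a Playlab feature: it assumes you have copied the violation labels shown by the “Show Message” toggle into a list by hand, in the “Moderated for: violence” format. The function name `tally_categories` and every category other than “violence” are hypothetical and used only for illustration.

```python
from collections import Counter

# Label prefix as displayed by the "Show Message" toggle.
PREFIX = "Moderated for:"


def tally_categories(labels):
    """Count how often each violation category appears in a list of labels."""
    counts = Counter()
    for label in labels:
        text = label.strip()
        if not text.startswith(PREFIX):
            continue  # skip anything that is not a moderation label
        category = text[len(PREFIX):].strip().lower()
        if category:
            counts[category] += 1
    return counts


# Hypothetical labels copied by hand while reviewing conversations.
labels = [
    "Moderated for: violence",
    "Moderated for: harassment",  # assumed category name, for illustration
    "Moderated for: Violence",
]

# Print categories from most to least frequent.
for category, count in tally_categories(labels).most_common():
    print(f"{category}: {count}")
```

Categories that recur at the top of the list are good candidates for new guardrails in your app’s instructions.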
FAQ
We Want Your Feedback!
Have you tried Escalating Flagged Conversations? We’d love to hear about your experience!
Contact us at [email protected]
Last updated: 3/27/2025