The Crisis Text Line platform knows a lot about our Crisis Counselors. It knows when a CC has been involved in an active rescue. It knows when a CC has been volunteering for over a year, or when a CC has helped 500 texters. When a supervisor sees a Crisis Counselor performing exceptionally well, the supervisor can make a note of it inside the platform. And when we want to leave feedback on a specific conversation, we can do that in the platform too. But even though the platform is the best way for Crisis Counselors to communicate with texters, it’s not the best tool for Coaches to keep up with all that’s happening in their Crisis Counselors’ worlds. For that, we use Salesforce, a customer relationship management tool.
Salesforce allows our Coaches to see, at a glance, everything important that has happened, so that they can efficiently follow up on events and reach out to Crisis Counselors, whether that be to say thanks for sharing so much of their time with us or to offer support after a difficult conversation. So we needed a way to get all that important information out of the platform and into Salesforce.
We designed a solution with three principles in mind: fault tolerance, scalability, and speed.
Fault tolerance: If something goes wrong while we’re sending information into Salesforce, it must not affect the platform in any way. We must always be able to respond to our texters, even if some other piece of the platform isn’t working.
Scalability: Because we’re growing so fast, we need our data exports to keep pace with our growth.
Speed: We wanted the exports to happen in real time. We didn’t want Coaches to have to wait for an overnight push of data in bulk, for example. When a Crisis Counselor experiences their first active rescue, their Coach is going to want to check in with them quickly to make sure they feel supported.
To accomplish our goals, we turned to Amazon Web Services. Whenever something important happens on the platform, such as an active rescue, the platform writes the information to an Amazon S3 bucket. Then, an S3 event triggers an AWS Lambda function that we created. This function retrieves the file from S3, parses the data inside, authenticates with Salesforce, and makes API calls to insert or update records with the data from S3.
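The flow above can be sketched roughly as follows. This is a minimal illustration, not our production function: it assumes the files are JSON, and the names `extract_s3_objects`, `process_event`, `fetch_object`, and `upsert_record` are hypothetical. The S3 download and the Salesforce calls are passed in as plain functions so the sketch runs without any credentials.

```python
import json
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def process_event(event, fetch_object, upsert_record):
    """Fetch each new file, parse it as JSON, and pass it on for upsert.

    fetch_object(bucket, key) returns the file contents as bytes.
    upsert_record(payload) is a hypothetical stand-in for the step that
    authenticates with Salesforce and inserts or updates a record.
    """
    processed = 0
    for bucket, key in extract_s3_objects(event):
        payload = json.loads(fetch_object(bucket, key))
        upsert_record(payload)
        processed += 1
    return processed
```

Inside a real Lambda handler, `fetch_object` would likely wrap a call such as boto3's `get_object` on an S3 client, and `upsert_record` would wrap whichever Salesforce client the team uses. Keeping both behind small functions also makes the parsing logic easy to test on its own.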
With Amazon S3 and Lambda, there is no additional infrastructure for us to maintain or scale. Amazon handles all of that behind the scenes. And it is fast. By the time our platform has loaded for a Crisis Counselor, Salesforce already knows about their login. What’s also great about this is that the platform knows nothing about Salesforce. It’s just pushing data, and we can attach any system that needs to listen for that data.
However, there is always a potential for problems. There are two main points of potential failure with this system: the sending of data from the platform into S3, and the sending of data into Salesforce. In both of these steps, we have built mechanisms to store and retry sending data whenever something goes wrong. After 10 failed attempts, our engineering team receives an alert so we can figure out what’s wrong.
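One way to picture the store-and-retry behavior is the sketch below. It is an assumption-laden illustration rather than our actual mechanism: `RetryQueue`, `PendingItem`, `send`, and `alert` are hypothetical names, pending items are kept in memory instead of durable storage, and the sketch assumes a failed item stays queued after the alert fires. Only the threshold of 10 attempts comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_ATTEMPTS = 10  # alert threshold described in the post


@dataclass
class PendingItem:
    payload: dict
    attempts: int = 0


@dataclass
class RetryQueue:
    send: Callable[[dict], None]               # raises on failure
    alert: Callable[[dict, Exception], None]   # notifies engineering
    max_attempts: int = MAX_ATTEMPTS
    pending: List[PendingItem] = field(default_factory=list)

    def submit(self, payload):
        """Store the payload, then try to deliver everything pending."""
        self.pending.append(PendingItem(payload))
        self.flush()

    def flush(self):
        """Attempt delivery of each stored item; keep the ones that fail."""
        still_pending = []
        for item in self.pending:
            try:
                self.send(item.payload)
            except Exception as exc:
                item.attempts += 1
                # Alert exactly once, when the threshold is reached.
                if item.attempts == self.max_attempts:
                    self.alert(item.payload, exc)
                still_pending.append(item)
        self.pending = still_pending
```

In this sketch, something like a scheduled job would call `flush()` periodically. Because the payload is stored before the first attempt, a failure in `send` never loses data and never blocks the caller beyond the attempt itself.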
When you’re building software to help people in crisis, you want that software to work, even when some things go wrong. Separating the concern of calling Salesforce APIs to import data helps to isolate the risk associated with that concern. When Salesforce goes down, the platform keeps on ticking. And that means we never stop being able to help the people who depend on us.