Mental health crisis intervention based on analytics and ML

Hanif Khalid, Senior Director of Product Management at PBS:

For over 50 years, PBS and our stations have served the American public with content that educates, informs and inspires. We're different from commercial media. We're not a network; we're a nonprofit membership organization serving 330 member stations across the US. And our member stations reach 96% of the US population.

So while our mission has stayed the same over the years, the media landscape has changed significantly. Through our very important partnership with AWS, we're leveraging cloud and data as a key part of our digital transformation.

As we scaled our digital footprint from a broadcast to a multiplatform organization, we faced multiple challenges. While PBS is unique in a lot of ways, you'll see that some of our data problems are not. And I'm sure as I share some of those problems, they might resonate with what you've seen in your own organization.

So why do we have a data problem? Well, first of all, we're collecting a tremendous amount of customer data, but all of it is getting collected in siloed systems. In reality, we have multiple sources of truth, which creates a disjointed view of our audience.

Once our data maturity strategy was established, we moved into execution mode, which meant building a unified and governed data lakehouse. This is where we started consolidating all our different data sources to help us uncover new actionable insights. A key principle for us was that we wanted to democratize data for both our internal and external stakeholders, external stakeholders being our stations.

So rather than data being accessible only to a privileged few employees in our org, we wanted that data consumable across the entire organization. And we really feel that, ultimately, this lakehouse is going to improve our data maturity so we can start using data more as a strategic resource.

Next, I'm just gonna share a couple of examples of how the data lakehouse is already starting to benefit us. Starting with the PBS recommendation engine: we wanted to help our viewers discover more PBS programs, we wanted to create personalized experiences for our viewers, and we wanted to highlight the great local station content for their audiences.

Since that prototype, we have now productized and released the recommendation engine across all PBS digital platforms. Another example I wanted to share with you all is that the innovation product team at PBS has also started experimenting with AI/ML, and we've predominantly been looking at models from OpenAI to begin with.

So we're exploring various use cases like, you know, writing in our brand voice or creating calls to action for our PBS mobile apps. What our innovation team has found is that this has been really helpful in creating metadata, such as tags, descriptions or titles, from things like our caption files. And we found that this can be a huge time saver for us.

So those are just two very specific examples. But overall, at a bird's-eye view, we truly feel the lakehouse project is going to unlock immense potential across our entire system.

To wrap up, I wanted to share kind of our top 3 lessons learned so far, if you're just getting started with data strategy or thinking about setting up data infrastructure at your organization:

  1. Getting organizational commitment - Building data maturity and data architecture takes time, and setting realistic expectations with your stakeholders will only serve you better, because you need that runway while the foundation is getting worked out.

  2. Having a clear "why" - Develop a data strategy that's aligned with your business objectives. This is also where you might want to ask yourself: do you actually have more than a single data use case? Because if you have just one use case, a data mart might be all that you need. But if you know your data is going to scale and you're going to have other use cases in the future, then it's absolutely worth investing in a data lakehouse.

  3. Focusing on data quality, data governance and data privacy - Data is what trains models, but privacy and governance dictate usage.

Thank you everyone for your time. Next, I'm gonna pass it over to Glenn to share the Stop Soldier Suicide story.

Glenn, CTO and Founder of The Black Box Project at Stop Soldier Suicide:

I'm going to talk to you about the Black Box Project. It uses digital forensics to look at the last second to the last year of life of a veteran or service member who has died by suicide, and then uses machine learning to identify patterns and trends in that data.

So who is Stop Soldier Suicide and what do we do? Our main focus when we first started was to bring in service members and veterans in crisis: have a call center for them to call in, make sure they get the resources they need, and then take them through a clinical process so we can make sure they're getting the help they need.

We've served over 1,400 clients. We've provided over 700 hours of care to them, a 23% increase year over year. And it's sad to say that that's needed.

I spent 11 years working in counterterrorism and intelligence around the world. I was a Black Hat instructor four separate times.

So I understand cybersecurity at another level. I spent nine years working digital forensics for child crimes. For a year, I worked for Homeland Security watching child crime videos and doing digital forensics on them. And then I have a foundation called the Sentinel Foundation, where I've conducted 25-plus operations in 12 different countries helping children who are being sex trafficked. I just got back 10 days ago from Gaza, getting Americans out. So we're very good at solving complex problems with innovation and data.

So, Black Box history: how did we start? How did we come up with this idea and the concept? Back in 2019, the CEO over at Stop Soldier Suicide, Chris Ford, called me up and said, hey, I got your number, I want to figure out how we can use data innovation to solve the suicide problem. And for me, it hit home personally. It's tough to say, but at 22 years old I was writing the eulogy for my best friend, who killed himself after Iraq. I've had five friends since that date and time kill themselves. It has a lasting effect on everyone that plays a part. So I was ready to jump at it any which way I could.

And when I sat down with them, I started thinking through the problem set. I started looking back at all the people I'd helped who were suicidal, along with my friends. And I realized one thing: they were all posting the same things on Facebook and Instagram about calling a buddy and helping a buddy. They were projecting out words of needing help, and it really opened my eyes: hey, there's something here. If a guy is suicidal, and I know it, and he's posting these memes online, then there's something in that data.

And I realized quickly, as Elon Musk said, your cell phone is your fifth limb. And we're leaving all of these devices behind. If you understand it from a security perspective, in a matter of six months to a year or two, these devices become a wealth of knowledge that we can actually get into. Your phone right now is hard to get into, but in six months it's not, and in 12 months it's even easier. So it's a time, space and distance advantage over security that we had. So I proposed the idea: let's just look at the last second to the last year of life to find out what happened. And that's where we came up with the Black Box Project. If we were still trying to solve aviation problems without having a black box, we would not be anywhere near where we are in the aviation field right now, right? So we had to see what was going on and what was happening so we could then predict and prevent it later on.

And so, you know, I thought, this is a great idea, we can go through it. Now, the hardest part was acquiring these devices, and I spent the next six months on that. Well, COVID happened, and we said, let's just bootstrap this thing, whatever we need to do. I said, I'll come over, we'll build it with all the old equipment we have, and we started piecing it together. The missing piece was the family members. So we started reaching out to our community and I started meeting with them. If you've ever dealt with a suicide situation, you go into the household and the mother's suicidal, the father's suicidal. It's a generational impact from this act. But they want to help, they want to give back. And so they had these devices, and I became friends with them; you can see from these two pictures. I go on fishing trips with them. I go golfing with them. We became a family with those first original five individuals.

And so we started looking through it. Once we had the devices (and as of today, we have 78 different cell phones from 68 different families, plus laptops, hard drives and tablets that will come later down the line), we focused on the cell phones. On the left you can see that during COVID we just built this makeshift forensic lab with old equipment we had, and on the right you can see we now have a government-level equivalent, with Cellebrite donating all the software that we needed for free, and we have Magnet Forensics. Once we had that, we were missing one piece. We know our strengths and our weaknesses, and we knew we needed a team that was machine learning-based and the best in the world.

And I was on a call with Amazon just for fundraising, and we told them a little bit about Black Box, and that was the turning point for this whole project. I'd had 10 other people tell me no, they didn't want to be a part of this project. Amazon said it was one of the most innovative ideas they'd seen and they would love to take part in it, and it validated everything that we thought we knew and had.

And so we spent the first few months going through all these different cell phones, and we started seeing trends. People say this is a spur-of-the-moment act, and there was no data to prove otherwise. Well, I recovered a deleted suicide note from three months prior; that threw that theory right out. We saw that five of the seven clients we originally got in all went to the VA that day. So it's not only about predicting and preventing, it's also accountability for what we're seeing through this data.

And so by partnering with Amazon, we then built a roadmap for the project. So let's look at what we learned. When I first started looking at this problem set, I thought about myself: I know when I'm hungry, I'm irritable. And the only way you can tell from a cell phone that I'm irritable is probably my text messaging to you. So we said, OK, let's run sentiment and emotion analysis over their cell phones, look at all outgoing messages, and try to identify what a good day is and what a bad day is.

And then once we have the good days and the bad days, we can narrow down and ask: what were they doing on that day? We were trying to find them in the mess. When we first started, we were trying to get better at targeting individuals to get them into the wellness center. So we started looking through it. Most of you are familiar with this: we had to look at positive, negative and neutral sentiment, and then at anger, disgust, fear, joy, neutral, sadness and surprise. And this is what we came up with.
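
As a rough illustration of that day-level scoring, here is a minimal sketch: per-message sentiment scores (which a service like Amazon Comprehend could supply) are averaged per day and each day is labeled. The thresholds, score range and data shapes are hypothetical, not Stop Soldier Suicide's actual algorithm.

```python
from collections import defaultdict
from datetime import datetime

def label_days(messages, bad_threshold=-0.3, good_threshold=0.3):
    """Average outgoing-message sentiment per day, then label each day.
    `messages` is a list of (ISO-8601 timestamp, score in [-1, 1]) pairs;
    the thresholds are illustrative placeholders."""
    by_day = defaultdict(list)
    for ts, score in messages:
        by_day[datetime.fromisoformat(ts).date()].append(score)
    labels = {}
    for day, scores in by_day.items():
        avg = sum(scores) / len(scores)
        if avg <= bad_threshold:
            labels[day] = "bad"
        elif avg >= good_threshold:
            labels[day] = "good"
        else:
            labels[day] = "neutral"
    return labels
```

Once days are labeled, the "bad" days become the ones whose activity patterns you dig into.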

So we have our dashboard that we originally built here, and you can see the red to the green on the screen. As they get closer, you can automatically see fear and anger get higher and higher. So we're looking at these spikes. And I want to know what they're doing on a good day too, because what activities were they at? We could target them there. And we need to know the patterns on the bad days so we know how to reach them.

And so we rank them by numbers and scores and go through, and real quick we realized that the military community is not the easiest community to analyze for sentiment or emotion, because we probably curse every 15 seconds. I'm gonna try not to during this presentation; Amazon asked me not to. So we used Comprehend, the commercial off-the-shelf option, and we realized real quick we had to create our own. So that's what we went through and started to do.

So after that, after we created our own sentiment and emotion algorithm using Amazon Comprehend, we had to look at semantic search. If you're not familiar with it: if I'm looking at somebody and I want to know if they're an alcoholic, I need to know if they're going out partying, I need to know if they're drinking. So I need every one of these different phrases. We went through the chats and messages and put in phrases like "I am in pain," "I want to kill myself," "I hate myself," "I need help" (reaching out for help), and "I want to get wasted," along with phrases about financial help. By putting these sentences in there, it's going to pull everything along those lines. That way, when we're going through it manually, we can follow the narrative: how were they feeling? We can really understand a lot about that individual. You can see it here; these are live results from it.
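
A bare-bones version of that retrieval might look like the sketch below. It does only literal, case-insensitive matching on the seed phrases mentioned above; a real semantic search would also catch paraphrases, for example with embeddings. Everything here is illustrative.

```python
SEED_PHRASES = (
    "i am in pain",
    "i want to kill myself",
    "i hate myself",
    "i need help",
    "i want to get wasted",
)

def flag_messages(messages, phrases=SEED_PHRASES):
    """Return messages containing any seed phrase (case-insensitive).
    A semantic search would also match paraphrased wording."""
    hits = []
    for msg in messages:
        lowered = msg.lower()
        if any(p in lowered for p in phrases):
            hits.append(msg)
    return hits
```

The flagged subset is what an analyst would then read manually to reconstruct the narrative.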

So we put our input in, we upload through the S3 process, it's all set up, we get the output, and we can start narrowing down on the phone. Because if you're not familiar, it's a massive amount of data. When you dump a cell phone, I have everything from your health data to your outgoing messages to your social media contacts. So this was the point in the project of really trying to understand this data set, because forensic tools were meant to categorize data for courts of law. They weren't meant to profile individuals based on their behaviors.

Now, it's just like Amazon knows exactly what I want to buy from the store. The reason they take all my money is that they're really good at advertising to me what I want and who I am. So we looked through and parsed the data. The results you're seeing here are from only 13 devices; out of the original 19, we could get into 13 that had quality data. Now we're up to about 58 cell phones, and those results are still processing; we're still going through them. But you can see there were veterans where, as I was reading through the messages (and now the software brings it out), it was: sponsor did not take them to the VA, killed himself. We had clients who went to the clinic, didn't get the help they needed, killed themselves. We had clients whose financial situation drove them to kill themselves. And we know this already. We all know that these major acts happen, and you have a very short window if somebody is on that brink. But now we have the data to back it up and show it. After that, we had to look at sleep, because there's a lot of talk about sleep deprivation.

But if you kill yourself, they're gonna ask your wife, your husband, your significant other, your family member: how well were they sleeping? And you're going to say, I don't know, I really don't. Well, with our phone being our fifth limb, we knew that our phone could be that missing piece. So we looked at message ranges between midnight and 4 a.m. and tried to identify what they were doing and how active they were.

So we created our own sleep analysis algorithm. We looked at message ranges, social media posts, connected Bluetooth devices, phone charging, health data and steps, and came up with this comprehensive sleep algorithm. And we see that for the majority of the service members we've studied, sleep deprivation is up, anger and fear are up, isolation is up, and the frequency of text messages is spiking upward.
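
The midnight-to-4-a.m. signal can be sketched like this: count what fraction of a person's phone events fall in that window. The threshold and the use of ISO-8601 timestamp strings are assumptions for illustration; the real algorithm also blends charging, Bluetooth, steps and social media signals.

```python
from datetime import datetime

def night_activity_score(event_timestamps, night_start=0, night_end=4):
    """Fraction of phone events (messages, posts, unlocks) that occur
    between midnight and 4 a.m. Timestamps are ISO-8601 strings."""
    if not event_timestamps:
        return 0.0
    night = sum(
        1 for ts in event_timestamps
        if night_start <= datetime.fromisoformat(ts).hour < night_end
    )
    return night / len(event_timestamps)

def sleep_deprived(event_timestamps, threshold=0.25):
    """Flag heavy night-time phone activity; the threshold is illustrative."""
    return night_activity_score(event_timestamps) >= threshold
```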

And so now we have the data to prove it. That's what we did: we combined the sleep, sentiment, isolation and emotion trends all together, as you can see in the diagram here, and mathematically kicked out the probability: were they sleep deprived, isolated, fearful, angry? And they're all showing the upward trend.
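
Combining the trends into a single number could look something like this weighted blend. The equal weights and the [0, 1] normalization of each signal are purely illustrative stand-ins for the mathematical model described.

```python
def risk_score(sleep_dep, neg_sentiment, isolation, fear_anger,
               weights=(0.25, 0.25, 0.25, 0.25)):
    """Blend four normalized trend signals (each in [0, 1]) into one score.
    Equal weights are a placeholder for a fitted model."""
    signals = (sleep_dep, neg_sentiment, isolation, fear_anger)
    if not all(0.0 <= s <= 1.0 for s in signals):
        raise ValueError("each signal must be normalized to [0, 1]")
    return sum(w * s for w, s in zip(weights, signals))
```

A score trending upward across all four inputs is what the dashboard surfaces as a warning sign.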

So you can see here their sleep start time increasing, and how long they stay asleep, because the first thing most people do after they wake up is pick up their cell phone. They want to check their messages. Especially if they're in a depressed state, they're looking at their Instagram, they're scrolling so they can get that dopamine fix. So we can actually see this.

And so where are we going? We have all this data and we have an understanding of what is happening, but what are we going to do about it? It takes great minds and people coming together; we know our limits. And I challenge every one of you in this room to think about how you could be a part of this project and to reach out to us. There are so many in this room who could be that missing piece that we need. What we're doing right now is incorporating our learnings into our call center. We're using our sentiment and emotion algorithms when clients call in: we're monitoring our wellness coordinators and we're also monitoring our clients to make sure they're both trending positively.

So if you call in, I can understand if you're having a bad day, but is there a positive trend after our wellness coordinator communicates with you? We're using Contact Lens for Amazon Connect for that initial project.

The second part is creating an SDK. So if you have an app out there, we could partner with you and give you our SDK, which will run across your back end to flag that your clients are potentially suicidal, and here's why. And then our other goal is to create our own app, called the Roger app. It's a resource app for our clients, so we can monitor them in real time. And when they're hitting these thresholds, you just have to call them up and check on them; maybe it's a call from their gym membership. The littlest things like that could really keep a person from committing the act.

This is for Amazon. I mean, this project would not exist without them. Like I said in the beginning, when they validated our hypothesis, that's where they took us to the next level. We're the first two-time winner of the AWS Imagine Grant. And then I have to thank the partners we have: AWS for Nonprofits, Amazon... I mean, I think I've dealt with all of Amazon at this point, and I love it. I would challenge any one of you that's thinking about using ProServe to use them. I have to give a big shout out to Sweater, the program manager, and TJ Gallagher; they were critical to the success of this project. We have Cellebrite and Magnet Forensics donating software for free; these are hundreds of thousands of dollars of software, donated for free. Augusta Digital and Per Vida Solutions are our partners in creating the processing of this data.

So that being said, I'll turn it over to Vina now. Thank you so much for being a part of this and taking the time. I know we're the last session and everyone's ready to go party in Vegas. So thank you so much.

So, like I said, we built this ourselves on premises, and the data for it is stored in a SQL Server sitting on premises. So we're not going to use Amazon AppFlow for that; it's not meant for that. Instead, we're going to use a different service, AWS Database Migration Service, which you can see in the top right-hand corner.

So DMS will help us either migrate the whole database into AWS, if we want to just move the whole thing and work from there, or perform ongoing replication between your on-premises database and AWS.
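
For the ongoing-replication option, a boto3 sketch might look like the following. The endpoint and instance ARNs are placeholders you would create beforehand, and the table-mapping rules here simply include every table in one schema; all names are illustrative.

```python
import json

def table_mappings(schema="dbo", table="%"):
    """DMS selection rules: replicate every table in the given schema."""
    return json.dumps({
        "rules": [{
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-schema",
            "object-locator": {"schema-name": schema, "table-name": table},
            "rule-action": "include",
        }]
    })

def start_ongoing_replication(source_arn, target_arn, instance_arn):
    """Create a full-load-and-CDC task: copy existing rows, then keep
    replicating changes from the on-prem SQL Server into AWS."""
    import boto3  # imported lazily so the mapping helper runs without the SDK
    dms = boto3.client("dms")
    return dms.create_replication_task(
        ReplicationTaskIdentifier="sqlserver-to-aws",
        SourceEndpointArn=source_arn,
        TargetEndpointArn=target_arn,
        ReplicationInstanceArn=instance_arn,
        MigrationType="full-load-and-cdc",
        TableMappings=table_mappings(),
    )
```

"full-load-and-cdc" is the mode that covers both cases mentioned: a one-time copy plus continuous change replication.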

So there we go, there's our two data sources already handled, but let's just try a few more on for size.

So let's say that you've got a website and you want to be able to gather clickstream data from it, to get more information about what people are actually clicking on. If you want to stream that data into AWS, you can use something from the Amazon Kinesis family, which you can see in the middle.
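
Streaming a click into Kinesis from the website's backend could be sketched like this with boto3. The stream name and event fields are hypothetical.

```python
import json
from datetime import datetime, timezone

def build_click_event(user_id, page, element):
    """Shape one clickstream record; the field names are illustrative."""
    return {
        "user_id": user_id,
        "page": page,
        "element": element,
        "ts": datetime.now(timezone.utc).isoformat(),
    }

def send_click(stream_name, event):
    """Put the record on a Kinesis data stream."""
    import boto3  # imported lazily so build_click_event needs no AWS SDK
    kinesis = boto3.client("kinesis")
    return kinesis.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=event["user_id"],  # keeps one user's clicks ordered
    )
```

A consumer such as Kinesis Data Firehose could then land these records in S3 for the rest of the pipeline.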

And then let's say that you wanted to acquire a third-party data set, maybe census information or address information. You can acquire third-party data sets through AWS Data Exchange, which you can see in the bottom right-hand corner. It has over 3,500 different third-party data sets that you can acquire, sometimes even for free.

So once we've ingested data into AWS, we have to store it somewhere and we're gonna start moving a little bit more quickly throughout these different stages.

So you've ingested your data; now where to put it? We're gonna put it in Amazon S3. We talked about S3 a little bit; this is really gonna be the foundation of our data lake. And you'll notice that I've got Redshift listed up here as well.

So, Amazon Redshift is our data warehousing service. But the question we're trying to answer is actually pretty simple. We only need two data sets for it, and let's just say they're relatively small; we're not doing large-scale, massive analytics here. So let's just say Redshift is overkill for now. We're going to stick with S3, and if we want, we can always add Redshift on later.

So we're keeping things very flexible by using S3 as our starting point.

So now that we've got our data properly stored, we need to process it, transform it in some way to get it into a state where it's very usable.

So unless you're extremely lucky, most of your data sets, as is, are not in a place where they can be used super cleanly. An example: let's say that you've got your donor management system sitting in Salesforce. You might not have every single value about every single one of your donors filled in; you might have lots of null values, not-applicable entries, et cetera.

That can be kind of messy if you're trying to run queries on data that looks like that. So you can use a service called AWS Glue ETL, which stands for extract, transform and load, to get that data processed into a form where it's going to be usable.
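
Glue ETL jobs are typically written in PySpark; as a plain-Python stand-in, the transform for those messy donor records might look like this. The field names, required columns and defaults are all illustrative.

```python
REQUIRED = ("donor_id", "email")
DEFAULTS = {"total_donated": 0.0, "state": "unknown"}
MISSING = (None, "", "N/A")

def clean_records(rows):
    """Drop rows missing a required field; fill optional gaps with defaults.
    Mirrors the kind of null-handling an AWS Glue ETL job would do."""
    cleaned = []
    for row in rows:
        if any(row.get(f) in MISSING for f in REQUIRED):
            continue  # unusable without an ID and an email
        fixed = dict(row)
        for field, default in DEFAULTS.items():
            if fixed.get(field) in MISSING:
                fixed[field] = default
        cleaned.append(fixed)
    return cleaned
```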

And then, as the next step, you can store all of your metadata in the AWS Glue Data Catalog. If you're familiar with the Apache Hive metastore, it's compatible with that; very similar concept. Really, this is where you're storing the location of your specific data.

So, like the path to the files in S3. It also includes information about the schema and the format. And the purpose of this is for a querying engine to be able to know how to work with your data.

So it's going to consult the data catalog to know where is your data set, first of all, and then how do I actually treat it?

So that brings us to querying engines. So analyze and visualize.

So we've got our data stored, we've got it cleaned and now we want to be able to do something with it. And this is really where we get to the juicy part and we're gonna get our question answered. We're gonna figure out which one of our donors are most likely to convert.

So in our case, we're gonna use Amazon Athena, which is a perfect example of a service that consults the Glue data catalog. So Amazon Athena is a serverless SQL querying engine.

It sits right on top of Amazon S3. So you'll notice I did not say that you had to put all of your data into a specific database to run queries on top of it. Athena can kind of layer right on top.

So this is where we're gonna answer our question. We're gonna use Athena to consult our two data sources, the events platform and the donor database, and we're gonna figure out who's most likely to convert.

So maybe we can kind of hypothesize that somebody who donated over $100 this past year and attended one of our events in the last month is most likely to convert. Maybe we're just thinking they're most likely to want to connect with us a little bit more.
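
Expressed as an Athena query, that hypothesis might look like the sketch below. The table and column names are made up; you would point them at whatever your Glue Data Catalog actually registers.

```python
def likely_converts_sql(donations="donations", events="event_attendance"):
    """Athena (Presto) SQL for: donated over $100 in the past year AND
    attended an event in the last month. All names are illustrative."""
    return f"""
        SELECT d.donor_id, SUM(d.amount) AS total_donated
        FROM {donations} d
        JOIN {events} e ON d.donor_id = e.donor_id
        WHERE d.donated_at >= date_add('year', -1, current_date)
          AND e.attended_at >= date_add('month', -1, current_date)
        GROUP BY d.donor_id
        HAVING SUM(d.amount) > 100
    """

def run_query(sql, database, output_s3):
    """Kick off the query; Athena consults the Glue Data Catalog for schemas."""
    import boto3  # imported lazily so building the SQL needs no AWS SDK
    athena = boto3.client("athena")
    return athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_s3},
    )
```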

So boom question answered, we solved our problem. But since that went so well, let's try to do a little bit more with it.

So that brings us to predict and share where we can start to apply machine learning techniques.

So we already talked about a few different machine learning tools that you can use for your use cases. We talked about Comprehend, mentioned personalized recommendation engines, and you've probably heard a whole lot about Bedrock and Q.

So there are lots of different AWS AI/ML services that you can use. But let's say for our use case, we have a non-technical user in our organization who wants to be able to ask questions about our data in natural language and get answers.

They don't know how to write SQL; very common. They can use Amazon QuickSight, or possibly Amazon Q, to get that question answered without having to know a bunch of SQL.

And then let's say that since things are going so well in your organization, everybody is so impressed with all the work that you've been doing. You might have another team in your org that wants to be able to get these data sets shared to them.

So they can start doing stuff too. You can use Amazon DataZone to share that data with the rest of your teammates.

So that brings us back to the beginning. We're back at our original pipeline; we've just fleshed it all out. This is meant to be extremely flexible. You should be picking and choosing the services that make sense for your use case, and if business requirements change, you can go back and adjust.

So the purpose of putting things into a data lake is that you're going to have a very flexible starting point, which makes the downstream stuff way easier as things change, you grow and your use cases evolve.

So you're gonna find that probably the first time you go through this process might take a little while, you might be trying to figure out which services you should be using or how to use them.

But as you go through this a couple of times, you're gonna come up with new questions to ask and you're gonna be ingesting more data sources as you go. And before you know it, you have become a data-driven organization.

So now that we've hopefully gotten the gears turning a little bit, how do you actually get started? You might want to have a cool story to talk about at re:Invent a year from now, on stage. No problem, I've got five steps for you.

So bullet number one is really the most important: pick an important but not urgent business problem to tackle. You want to be asking a question about your organization that's actually attainable for you to answer; this shouldn't require a million different data sources. You want to set a reasonable goal so that you can get a quick win.

That way, when you get to the urgent, business-critical stuff, you've already gone through this process a few times and you're familiar with the tooling. So really, the technology comes second; you wanna pick a good question. That's the most important thing.

The next step is to determine what data you need to answer this question. In our case, we needed Salesforce and that homegrown events platform, but you might have different sources you need to answer your question. So figure out what those sources are and incrementally populate your data lake. You don't need to bring everything in all at once.

Next up, run analytics. In our case, this was a really simple SQL query, but you can imagine that this could be an AI tool, something cool with machine learning, or maybe building out a business intelligence dashboard. Run analytics to actually get an answer to your question.

Next up, take action based on those insights. In our case, we think we've got five people who are most likely to convert to become monthly donors, but we don't really know. So maybe we could track those people over time and test our hypothesis, or maybe we can nudge them a little and send them a personalized text message to see if they want to come back.

The last step is to measure against your original goal. Did you actually accomplish what you set out to accomplish? If you did, that's awesome. But if you didn't, that's actually OK, because this is an opportunity to adjust and try different things until you solve the problem you came to solve.

So this is not meant to be a one and done transformation. This is going to be an iterative approach and being ok with that will help you build momentum and set goals for yourself that actually feel realistic and attainable.

So start small but think big because this stuff is really cool.

So we have loads of tactical support that we can provide you with as you're getting started on this journey.

So you may already have access to your account manager and a solutions architect who can help you build an architecture diagram like the one we saw before. We can do demos of services and help you build proofs of concept; all of that is already available to you.

We can also provide you with more strategic support. If you want to get a sense of where you are in that maturity model that Hanif showed, we can do something like that, and we can help you build a roadmap going forward that focuses on your business needs over the next few years.

So we can provide all kinds of different support. But really all this happens when you come and speak with us.

So you guys already did the hard thing. You bought a ticket to re:Invent. You flew all the way here. You've been in Vegas for four, sometimes even five, days, and you sat through this session.

So the best way to make this worth your time is to come speak with us and start a dialogue with your rep at AWS.

And on that note, here is our contact information. We're all gonna stick around for a little bit to answer questions. If you don't know who your AWS account manager is and you want to get in touch, you can use this QR code to get contact information for that individual.

And yeah, thank you all very much for joining and thanks so much to my co presenters.
