Site Reliability Engineering for Mobile Applications

A Theoretical Guide

You have probably heard the term Site Reliability Engineering (SRE) and wondered how it can be applied to mobile applications. Before diving into the ‘how’, let’s first understand what SRE is.

The term site reliability engineering was coined by Benjamin Treynor, VP of Engineering at Google.

Ben defines SRE as “what happens when you ask a software engineer to design an operations function”. SRE therefore builds a bridge between development and operations by applying a software engineering mindset to system administration topics.

Put another way, it means building self-service tools, for example automatic provisioning of test environments, or log and statistics visualisation. SREs collaborate closely with product developers to ensure that the designed solution meets non-functional requirements such as availability, performance, security, and maintainability. They also work with release engineers to ensure that the software delivery pipeline is as efficient as possible.

Ben and a group of Google engineers have also written books on the subject, the Site Reliability Engineering books. If you would rather read about it in a nutshell, I would love to write a separate story on that topic (let me know in the comments below).

Coming back to applying reliability engineering to mobile applications: before jumping to ‘how to apply it’, let’s discuss why mobile applications need reliability engineering in the first place.

Why do we need SRE in mobile applications?

Mobile apps are becoming more and more complex these days. Whether it is a simple to-do app, a calendar app, or a cab booking, social or food ordering app, engineers and companies need to monitor the performance, security and availability of the app. Let’s consider a few scenarios:

  • The user opens your app and sees a message saying “application has stopped” or “application not responding”.

  • The user taps a button to navigate to a feature, but the app shows no sign of responding to the tap.

  • Your server returns valid responses, but the user sees a blank screen in the app.

  • A feature works fine in one geographic location but crashes in another where it is supposed to work.

  • Your app gets bad reviews on the Play Store, including reports of battery drain.

As an application developer, you have built all the features of a full-fledged app, but have you ever wondered how reliable they are in the real-world scenarios above? That is where SRE comes into play, and the same principles should be followed for mobile applications. Let’s finally dive into the ‘how’ part. :)

How to use SRE in mobile applications?

When we think of websites and APIs (sending requests and getting responses), we have various tools and techniques to monitor everything, and any change or update is just a deployment away. With a mobile application, however, we need to publish a new APK/IPA file to the store and wait for users to update the app. In the following sections, I describe some measures based on SRE principles that you should consider alongside mobile app development.

  1. Availability of Apps

    To understand the availability of an app, we need on-device, client-side telemetry that measures it and gives us visibility.

    If you can’t measure it, you can’t improve it. So, what can we measure with?

  • Crash reports: A crash can occur for a number of reasons and is a clear signal of app unavailability. Solutions like Firebase Crashlytics can help collect stack traces and give you clues about factors such as the location in code, app version, device model and locale. With that, you can decide whether the issue can be mitigated by pausing the rollout, changing a config flag or updating the server response.

  • Analytics: Performance monitoring solutions such as Firebase Performance Monitoring capture and transport logged events from mobile devices and generate client-side SLI metrics, which can be used for production monitoring or analytics. Third-party frameworks on the market, such as New Relic and MoEngage, can be used for the same purpose.

  • Error logs: You can set up your own custom logging APIs for analytics and error logging. Custom logging APIs are lightweight APIs designed around the product requirements, and they can be used across platforms to log user events, click actions, user behaviour, API errors and much more. The data is sent to your server and can then be converted into metrics using a dashboard or other tools. This is helpful for both product and operations teams.
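As a rough illustration of the custom logging idea above, the sketch below buffers events and sends them in batches, then derives a crash-free session rate as one possible client-side availability SLI. It is written in Python only to keep the logic short; on a device the same logic would live in the app's own language. The names `EventLogger`, `send` and `crash_free_rate`, the event fields, and the batch size are all assumptions made for this example, not part of any real SDK.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class EventLogger:
    """Buffers events in memory and hands them to `send` in batches."""

    send: Callable[[List[Dict]], None]  # hypothetical transport, e.g. an HTTPS POST
    batch_size: int = 20
    _buffer: List[Dict] = field(default_factory=list, init=False)

    def log(self, name: str, **attrs) -> None:
        """Record one event; flush automatically once the batch is full."""
        self._buffer.append({"name": name, **attrs})
        if len(self._buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered (for example when the app is backgrounded)."""
        if self._buffer:
            self.send(self._buffer)
            self._buffer = []


def crash_free_rate(events: List[Dict]) -> float:
    """Client-side availability SLI: share of sessions with no crash event."""
    sessions = {e["session"] for e in events}
    crashed = {e["session"] for e in events if e["name"] == "crash"}
    if not sessions:
        return 1.0
    return 1 - len(crashed) / len(sessions)


# Usage: `received` stands in for the server-side event store.
received: List[Dict] = []
logger = EventLogger(send=received.extend, batch_size=2)
logger.log("session_start", session="s1")
logger.log("crash", session="s1", screen="checkout")
logger.log("session_start", session="s2")
logger.flush()
print(crash_free_rate(received))  # 0.5 (one of two sessions crashed)
```

Batching keeps network and battery usage low, which matters because the telemetry itself should not hurt the metrics it is measuring.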

  2. Performance & Efficiency of Apps

    The battery is arguably the most valuable resource of a mobile device. Mobile apps on a device share precious resources such as battery, network, storage, CPU and memory. You certainly do not want your app to be at the top of the battery or network usage list and attract negative reviews.

  • Android Vitals: An initiative by Google to improve the stability and performance of Android devices. When an opted-in user runs your app, their Android device logs various metrics, including data about app stability, app startup time, battery usage, render time, and permission denials. The Google Play Console aggregates this data and displays it in the Android vitals dashboard.

  • Profiler: Developers need to run a variety of internal tests to collect statistics on mobile system resources such as battery, memory and binary size. Any unexpected regressions are triaged and fixed before launch. Much of this system health testing can be automated, so that reports are easy to prepare for review.
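The automated regression check described above could look roughly like the following sketch. It compares a candidate build's metrics against a baseline and flags anything that got worse by more than a tolerance. The metric names, the numbers and the 5% tolerance are made up for illustration, and the check assumes that lower values are better for every metric.

```python
from typing import Dict, Tuple


def find_regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.05,
) -> Dict[str, Tuple[float, float]]:
    """Return metrics where the candidate exceeds the baseline by more than
    `tolerance` (a fraction). Assumes lower is better for every metric."""
    regressions = {}
    for metric, base in baseline.items():
        new = candidate.get(metric)
        if new is None:
            continue  # metric was not collected for this build
        if new > base * (1 + tolerance):
            regressions[metric] = (base, new)
    return regressions


# Hypothetical numbers from two internal test runs.
baseline = {"startup_ms": 800, "memory_mb": 120, "apk_size_mb": 25}
candidate = {"startup_ms": 950, "memory_mb": 122, "apk_size_mb": 25}

print(find_regressions(baseline, candidate))  # {'startup_ms': (800, 950)}
```

A check like this can run in the build pipeline so that a regression blocks the release instead of being discovered in store reviews.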

  3. Rollout Strategies

    When releasing client applications, best practices are particularly important to SRE because client rollbacks are nearly impossible, and issues found in production can be irrecoverable and erode user trust, which can even lead to app uninstalls. The following practices can be considered to make releases safer:

  • Staged rollout / phased releases: All changes should go through some sort of staged rollout before being released fully to external users. This allows you to gather production feedback on your release gradually rather than pushing it to all users at once.

  • A/B analysis: Release all changes via experiments and conduct an A/B analysis. Control and treatment group selection should be randomised for every change to ensure that the same group of users is not repeatedly updating their applications.

  • Feature flags: New code ships in a binary release but should be disabled by default behind a feature flag. Shipping the code in the binary puts it on all users’ devices, while launching it with a feature flag enables it only for a smaller set of users controlled by the developer. Rolling back is then as simple as ramping the flag back down to 0%, instead of rebuilding an entire binary with the fix and releasing it to the world.
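One common way to implement the percentage ramp and the per-change randomisation described above is to hash the user ID together with the flag or experiment name into a stable bucket. The sketch below assumes this hashing approach; it is not taken from any particular feature flag service, and the names `bucket`, `is_enabled` and `new_checkout` are hypothetical.

```python
import hashlib


def bucket(user_id: str, salt: str) -> int:
    """Map a user to a stable bucket in [0, 100) for the given salt."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(user_id: str, flag: str, rollout_percent: int) -> bool:
    """True if the user falls inside the current rollout percentage.

    Using the flag name as the salt gives every flag (or experiment) its
    own independent assignment of users to buckets.
    """
    return bucket(user_id, flag) < rollout_percent


users = [f"user-{i}" for i in range(1000)]

# Ramp up: users enabled at 10% stay enabled when the flag moves to 50%.
early = [u for u in users if is_enabled(u, "new_checkout", 10)]
assert all(is_enabled(u, "new_checkout", 50) for u in early)

# Roll back: ramping to 0% disables the feature for everyone.
assert not any(is_enabled(u, "new_checkout", 0) for u in users)
```

Because the bucket depends only on the user ID and the flag name, the decision is stable across sessions without storing any per-user state, and the rollout percentage can be served as remote configuration.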

Wrapping Up

So, that is a brief look at what SRE is and how it can be applied to mobile applications to make them more reliable, secure, maintainable and scalable. I believe that incorporating the above techniques into the management of native mobile applications gives us a sound strategy for building reliable products and services.

A smart person once said:

“If you like the code you wrote a year ago, you haven’t learned enough this year.”

Thank you for taking the time to read. Please let me know in the comments what you think of the above approach and how things can be improved for application development. It will certainly encourage me to learn and write more cool stuff.

Follow me here for more:

Translated from: https://medium.com/swlh/site-reliability-engineering-with-mobile-applications-66cfe9d8bd3a
