This post is not purely about technology, though that is its main subject. It’s also about development, strategy, management, and growth. Let me share with you some thoughts of a former CTO on how you could approach building competencies (of your company or yourself) for analytics in Microsoft Azure. The first part is for an individual.
The purpose of this blog post
For the last 4.5 years, before I rejoined Microsoft, I was heavily involved in the development of cloud analytics competencies, mostly on Microsoft Azure, in the companies I’ve worked for. For the last 3 years I have been responsible for building a development strategy for technical skills in fast-growing companies on the Polish market (Clouds on Mars and Elitmind). During that time I’ve learned a lot, both from successes and failures. And I thought I could share some general suggestions and approaches to building analytics competencies in Azure, without disclosing the secrets of my current Partners.
The (not so) easy part – technology
Let’s start with technology. Azure offers many services for Analytics and AI. I’m not going to cover them all, but assuming your focus will be on descriptive and predictive analytics, I would concentrate on the following Platform-as-a-Service (PaaS) services:
- Azure Data Lake Storage – scalable, secure, and highly available storage optimized for analytics, can be used as a landing zone for your data in the cloud or as the storage layer for the data lake,
- Azure Synapse Analytics – a “flagship” of analytics in Azure, a service which is a combo of data integration, enterprise data warehousing, and big data analytics,
- Azure Data Factory – a fully managed and serverless data integration solution for ingesting, preparing, and transforming the data in the orchestrated pipelines,
- Azure Databricks – fast, easy to use, and collaborative Apache Spark-based analytics service,
- Azure SQL Database – relational database in the cloud, PaaS implementation of the popular SQL Server database engine, can be used as relational storage (or even a data warehouse of a smaller size),
- Azure HDInsight – enterprise-grade, managed cluster service running popular open-source frameworks – including Apache Hadoop, Spark, Hive, Kafka and more,
- Azure Event Hubs – simple, scalable, and secure service for real-time data ingestion, supports popular protocols – including AMQP, HTTPS, and Apache Kafka,
- Azure IoT Hub – managed service for bidirectional communication between IoT devices and Azure,
- Azure Stream Analytics – serverless real-time analytics service that can run in the cloud and on the edge,
- Azure Data Explorer – fast, fully-managed data analytics service for real-time analysis on large volumes of data streaming from apps, websites, IoT devices, and more (one of the most underestimated Azure services, in my opinion),
- Azure Machine Learning – enterprise-grade machine learning service to build and deploy models faster,
- Azure Cognitive Services – comprehensive family of AI services and cognitive APIs for intelligent apps,
- Azure Bot Service – managed service built for bot development,
- Azure Cosmos DB – fast and scalable NoSQL database with open APIs, can provide an analytical store to seamlessly integrate with Azure Synapse for no-ETL analytics.
In addition to the services mentioned above, Power BI should become the apple of your eye. This BI platform, offered in a Software-as-a-Service (SaaS) model and built on top of Azure services, is definitely leading the market. It is used by over 150,000 organizations worldwide, including 97% of Fortune 500 companies, and integrates best with Azure services for analytics.
Why do I suggest PaaS and SaaS for analytics, you may ask? There are several reasons for that:
- Agility and elasticity – using PaaS and SaaS typically lets you deliver analytical products faster and react more quickly to changing requirements (e.g. scale).
- Scalability – most of the PaaS and SaaS services offer good scale-up/scale-out capabilities allowing quick alignment of compute power and storage to the existing workloads.
- Focus on what matters – PaaS and SaaS services typically require less maintenance so you can focus more on data and process modeling to provide business value.
Of course, there still can be some cases in which running a selected setup of software on virtual machines can be a justified decision for the architecture, but in general I’m a fan of making the architecture design simple and with as little maintenance as possible, so the focus would be on the analytics itself, not the infrastructure underneath.
- Microsoft Learn – Azure SQL fundamentals – learning path covering Azure SQL Database and Azure SQL Managed Instance; in general I recommend Microsoft Learn as your primary learning asset when building your skills in Microsoft technologies,
- Microsoft Learn – preparation resources for exam DP-203 – learning paths for Azure Data Engineer covering Data Lake, Data Factory, Synapse, Databricks, and Stream Analytics,
- SQL Server and Azure SQL Labs and Workshops – great collection of resources covering SQL Server 2019 and Azure SQL,
- Microsoft Cloud Workshop – a set of extensive labs on key Azure workloads, including data (Azure SQL, database migrations to Azure, Synapse, Cosmos DB), Big Data & analytics (Synapse, Databricks, Azure ML, Cognitive Services, MLOps concepts), IoT (IoT Hub, Stream Analytics),
- Microsoft AI School – a hub of learning resources for data scientists and AI engineers working with Azure AI services,
- Data Exposed at Channel 9 – a bunch of short videos, full of demos, mostly on Azure data services,
- Pluralsight – commercial option, great online learning portal with lots of cloud-oriented courses (historically they had an agreement with Microsoft and provided some free courses on Azure, but that’s not the case anymore),
- SQLBI training – probably the best courses (commercial) on tabular modeling and DAX – skills expected from Power BI developers,
- Azure Architecture – a website containing examples of reference architectures for common workloads on Azure,
- additionally, having some internal learning resources in your organization can help a lot – at Elitmind we used our own commercial Power BI online course to teach our back-end consultants Power BI, tabular modeling and DAX (it was easy for us to track learning progress having our internal tools and learning platform).
Develop your skills as an individual
Building a plan for an individual (e.g. yourself) is typically much easier than building a comprehensive skilling strategy for the whole organization, even if at first sight it may look like a massive task when you think about the amount of knowledge to acquire and the number of resources available. But let me give you some basic guidance on how to learn all those Azure services in a way that makes the learning process satisfying and practical:
- Start by setting goals. Know your direction – pick one or at most two specializations to follow. In my opinion, an official Microsoft certification can be a great motivator and can be used to keep up the pace of your learning and to track your progress. Also, certificates are a solid sign of your self-development and effort in your CV, and they count towards the Data & AI competencies of your employer if it is a Microsoft partner, so it shouldn’t be hard to get a reimbursement (also ask your employer if there are any programs or vouchers that may lower the cost of your certification exam or even make the exam free).
Consider the following certifications:
- Microsoft Certified: Azure Data Engineer Associate – for cloud data engineers focused on Big Data analytics (data lakes, data warehouses, real-time analytics),
- Microsoft Certified: Data Analyst Associate – for data analysts and BI developers working with Power BI,
- Microsoft Certified: Azure Data Scientist Associate – for data scientists working with Azure Machine Learning service,
- Microsoft Certified: Azure AI Engineer Associate – for engineers and developers using Cognitive Services and Bot service.
- Learn systematically. Sounds easy, but in reality we all know how it is. Just make sure you allocate time and split your learning goals into reasonable milestones. Fortunately, today you can use the power of online learning resources to upskill quickly and on a daily basis (see the links above).
- Breadth first. Yes, specialization is important. But deep dive knowledge will come over time, along with your experience. Do not try to become an expert on each service right away. Rather focus on understanding the purpose and capabilities of each service and how to combine them together (see the next point).
- Know the limitations of the services and how the services integrate. This will become crucial when you step into real-world projects. You will rarely use just a single service…
- Be open-minded. The way to implement things in the cloud may differ a lot from on-premises. Also, keep in mind there are always multiple ways to implement your solutions in the cloud, so try different approaches and do not be afraid to experiment (but make sure you experiment wisely, keeping an eye on the cost).
- Go beyond Data & AI services.
- Make sure you know Azure fundamentals (use the learning paths at Microsoft Learn and the great free online course by Adam Marczak). Get at least some basics of networking, security, identity and governance in Azure.
- Learn additional services for automation (Azure Logic Apps, Azure Functions, Azure Automation) and security (Azure Key Vault).
- Take an interest in DataOps (see an example), MLOps, CI/CD (see an example), and Infrastructure as Code (see its benefits).
- Become familiar with Cloud Adoption Framework and Azure Well-Architected Framework to make your solutions for analytics aligned to the general best practices for cloud architecture and development.
- Understand pricing and learn how to optimize billing. Cost optimization and monitoring should be constant parts of projects in Azure. Learn what you pay for when using a specific service. Use the Azure Pricing Calculator to estimate what costs your solutions can generate (additionally, you can play with the Azure TCO Calculator to compare the TCO of your cloud infrastructure vs on-premises). Always think about possible savings (scale-down/scale-in, pausing resources, reservations, and more).
- Practice, practice, practice… There is no other way to get hands-on experience. You simply have to experiment, try, run some benchmarks and tests. For that you will need access to an Azure subscription. You can start with a free Azure account, but later on you will probably need some other way to get your “Azure playground” – for that you can use (ask your employer for) MSDN subscriptions, the Visual Studio partner benefit or Azure Pass. Keep in mind that many labs at Microsoft Learn (see links above) offer free-of-charge sandboxes to complete the labs.
- Be curious about different vendors and open-source solutions. Yes, you heard me. Track what Google and AWS have in their clouds. Keep an eye on the open-source world (Apache ecosystem!). Observe fast-growing and leading vendors of Big Data solutions that can run their platforms in the cloud (Databricks, Snowflake, Cloudera, Teradata, Dremio – to name a few of those I’m watching). Look for inspiration, similarities and differentiators among platforms and services. Understand how organizations can benefit from multi-cloud strategies or from running third-party SaaS platforms on top of major clouds.
- Technology is just a piece of this puzzle. No matter which role you focus on, there are things other than technology to learn. Data modeling (e.g. data warehouse design methodologies – Kimball, Inmon, Data Vault) and programming languages (e.g. Python, SQL, DAX) are good examples.
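To make the earlier point about pricing a bit more concrete, here is a minimal Python sketch of the kind of back-of-the-envelope estimate worth doing before a project starts. It compares a resource that runs around the clock with the same resource paused outside working hours. The hourly rate and the schedule are made-up assumptions for illustration, not real Azure prices; take actual rates from the Azure Pricing Calculator.

```python
def monthly_compute_cost(hourly_rate: float,
                         active_hours_per_day: float,
                         days_per_month: int = 30) -> float:
    """Cost of a resource that is billed per hour only while it is running."""
    return hourly_rate * active_hours_per_day * days_per_month


# Hypothetical hourly rate, NOT a real Azure price.
rate = 1.50

# Scenario 1: the resource is never paused.
always_on = monthly_compute_cost(rate, active_hours_per_day=24)

# Scenario 2 (assumed schedule): running 10 hours a day on 22 working days.
business_hours = monthly_compute_cost(rate, active_hours_per_day=10,
                                      days_per_month=22)

savings = always_on - business_hours
savings_pct = 100 * savings / always_on

print(f"Always on:      {always_on:.2f}")
print(f"Business hours: {business_hours:.2f}")
print(f"Savings:        {savings:.2f} ({savings_pct:.1f}%)")
```

The sketch only covers compute that can be paused. Storage, networking and reservations are billed differently, so treat the result as a first approximation.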
Grow as a company
If you are responsible for planning and executing skilling processes and building Azure analytics competencies in your company or your team, here is some advice for you:
- Leave people some space for self-development. There should be clear rules on how employees can plan learning during working hours.
- Provide tools and resources for learning. People should know what learning resources, programs and exam discounts are available and how to use them.
- Agree on learning goals and deliverables. No goal, no self-development. People should be accountable for the working hours spent on their development. Building technical competencies should follow a written plan with agreed goals (e.g. certifications) and deliverables (e.g. re-delivery of knowledge in a team), and the execution of this plan should be a matter of regular discussion between employees and their bosses.
- Reward learners and achievers. Appreciate people for their efforts and learning performance. It doesn’t have to be a salary bonus, but make people proud of what they achieve.
- Consider gamification of the learning process. Healthy competition can motivate a lot and provide a lot of fun (e.g. a small gift for the most certified team in the company every quarter).
- Create Centers of Excellence (or Guilds, whatever you call them). Learning in a community is often much more effective. Build a knowledge sharing culture in the company. Make sure to standardize tools for communication.
- Make sure people contribute to the learning process of their peers. One of the major points of each senior employee’s development plan should be to participate (as a supporting peer or a mentor) in someone else’s learning process. Teaching and mentoring can build additional soft skills and also help a lot in creating team spirit.
- Align the development plan of each team member to the strategy of your team or company. Here is the tough part. At the end of the day, it’s all about how well the skills of the team fit the holistic vision of the organization’s competencies. Make sure people understand how they contribute to the success of the whole company or team.
- Allow people to specialize. Do not build a monolithic development plan for all team members. Let people choose whether they want to lean towards being data engineers, data analysts, data scientists, and so on. Of course, there should be a clearly stated (written down) body of common knowledge everyone should have (e.g. Azure and data fundamentals).
- People learn better when they are inspired. I’ve heard a sentence once that “people prefer to work with a ‘general’ who does not sit in the trenches, but sets an example by his fighting attitude”. There is no better inspiration than leaders in the company setting the directions of development by their example.
Did I miss something? Let me know!
Wow, that was quite a blog post! Thank you for reading. Now that you have made it this far, I have one more huge request for you: please, share your thoughts with me.
I realize I’m biased in many ways. I was not “born in the cloud”. For me, moving my mindset towards public cloud as a first choice for analytics has been a long process of unlearning some things I’ve learned when I worked with on-premises solutions and slowly adopting the advantages and capabilities that cloud services can bring. Yet, I’m still learning and I’d be glad to hear about your experiences and learnings.
I’m looking forward to reading your comments!