Author: Sasikanth

  • Retrieval-Augmented Generation (RAG): The Missing Link in Generative AI

    As someone who’s spent years designing AI infrastructure and large-scale distributed systems, I’ve seen firsthand the challenges businesses face when adopting generative AI at scale. One recurring issue? Generative models are only as good as the static data they’re trained on—a limitation that becomes glaringly obvious in dynamic, real-world applications. This is where Retrieval-Augmented Generation (RAG) steps in and reshapes the narrative.

    What is RAG, and Why Does It Matter?

    RAG combines the strengths of generative AI with retrieval-based systems to deliver responses that are not only creative but also grounded in the most relevant and up-to-date information. Unlike traditional generative models that rely solely on pre-trained data, RAG dynamically integrates external knowledge from databases, APIs, or documents at the time of inference.

    This hybrid architecture ensures generative AI systems stay relevant, accurate, and adaptable—qualities increasingly vital in high-stakes industries like healthcare, finance, and customer support.
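    To make this architecture concrete, here is a minimal sketch of the retrieve-then-generate flow in Python. The retriever and generator objects are hypothetical placeholders standing in for a real vector store (or search index) and an LLM client; the point is the shape of the pipeline, not any particular product.

    def answer_with_rag(question: str, retriever, generator, top_k: int = 5) -> str:
        """Minimal RAG flow; the retriever and generator are supplied by the caller."""
        # 1. Fetch the passages most relevant to the question from the external
        #    knowledge source (vector database, search index, document store, ...).
        passages = retriever.search(question, top_k=top_k)

        # 2. Build a prompt that grounds the model in the retrieved context.
        context = "\n\n".join(p["text"] for p in passages)
        prompt = (
            "Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}\nAnswer:"
        )

        # 3. Generate a response conditioned on the retrieved, up-to-date evidence.
        return generator.complete(prompt)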

    Where RAG Excels

    1. Dynamic, Domain-Specific Applications

    Imagine deploying an AI assistant for a healthcare provider. While a conventional model might fail to incorporate the latest clinical guidelines, a RAG-powered assistant retrieves current medical literature and tailors its responses to the patient’s case. This ensures accurate, actionable insights that are grounded in reality.

    2. Scalable Knowledge, Lower Costs

    Training generative models on vast, ever-changing datasets is both costly and inefficient. RAG sidesteps this issue by maintaining a lean generative model while offloading the knowledge component to external retrieval systems. This approach enables enterprises to manage terabytes of dynamic data without exorbitant costs.

    3. Reduced Hallucination Risks

    One of the most significant drawbacks of generative AI is its tendency to “hallucinate”—producing responses that are plausible yet false. By anchoring outputs to retrieved, verifiable data, RAG minimizes this risk, fostering trust and reliability in AI-generated content.

    How RAG Transforms Use Cases

    Having worked on AI and cloud solutions that demand both creativity and precision, I’ve seen RAG solve real-world challenges across industries. Here are some standout applications:

    Customer Support

    A RAG-powered chatbot can dynamically pull information from a company’s knowledge base, delivering personalized and contextually accurate responses. This not only enhances customer experience but also reduces reliance on expensive human intervention.

    Enterprise Search

    For organizations overwhelmed with unstructured data, RAG revolutionizes how information is accessed. Employees can ask natural language questions and receive pinpointed, document-backed answers, turning data overload into actionable insights.

    Healthcare Assistants

    In a past project, we integrated RAG to power a healthcare AI assistant that retrieved real-time data like lab results and clinical guidelines. This ensured responses were both patient-specific and clinically up-to-date, significantly improving care delivery.

    Challenges of Implementing RAG

    1. Data Quality

    The effectiveness of RAG depends on the quality and reliability of the retrieval system’s data source. Outdated or biased information leads to flawed outputs, making data curation a critical step in deployment.

    2. Latency

    Real-time retrieval and generation can introduce latency, particularly when dealing with massive datasets. To address this, optimizing retrieval engines like Elasticsearch or vector databases like Pinecone is crucial for maintaining speed and efficiency.
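    One common mitigation, independent of the specific engine, is to cache embeddings (or whole retrieval results) so repeated or identical queries skip the expensive round trip. The sketch below is illustrative only; the embed and vector_search callables are caller-supplied placeholders for your embedding model and vector store.

    from functools import lru_cache
    from typing import Callable, List, Sequence

    def make_cached_retriever(
        embed: Callable[[str], Sequence[float]],
        vector_search: Callable[[Sequence[float], int], List[str]],
        top_k: int = 5,
    ):
        """Wrap an embedding model and a vector store so repeated queries reuse the embedding."""

        @lru_cache(maxsize=10_000)
        def _cached_embedding(query: str) -> tuple:
            # Embed once per distinct query string; later identical queries hit the cache.
            return tuple(embed(query))

        def retrieve(query: str) -> List[str]:
            # The vector search still runs, but the embedding cost is amortized.
            return vector_search(_cached_embedding(query), top_k)

        return retrieve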

    Why RAG is the Future

    Generative AI has proven its potential, but it cannot solve dynamic, real-world problems in isolation. RAG bridges the gap between static pre-trained models and the dynamic, ever-evolving nature of knowledge in the real world. By pairing the generative capabilities of AI with retrieval-based systems, we can unlock a new era of scalable, reliable, and intelligent applications.

    Whether it’s revolutionizing enterprise search, improving customer interactions, or transforming healthcare, RAG offers a practical, scalable pathway to expand the scope of AI innovation.

    What’s Next? Let’s Collaborate

    As I continue exploring RAG’s untapped potential, I’m eager to hear your thoughts. What industries or problems do you think could benefit most from RAG? Let’s connect and exchange ideas—I’m always up for a deep dive into how we can push the boundaries of AI innovation.

  • How Enterprises Can Effectively Adopt Generative AI Technology

    In today’s rapidly evolving digital landscape, the adoption of generative AI technology is no longer a matter of “if,” but “when.” As a Solution Architect with a passion for staying ahead of technological trends, I’ve been closely observing how generative AI can transform enterprises across industries. From automating content creation to revolutionizing customer service and enhancing product design, the possibilities are vast. But why does generative AI matter so much to enterprises today?

    In an era where businesses are under constant pressure to innovate and stay ahead of the competition, generative AI offers a unique opportunity to transform operations, create new revenue streams, and deliver personalized customer experiences at scale. By leveraging the power of AI to generate content, code, designs, and more, enterprises can unlock new efficiencies, reduce costs, and drive growth in ways that were previously unimaginable.

    However, adopting generative AI technology isn’t just about plugging in the latest tool and expecting instant results. It requires a thoughtful, strategic approach to ensure successful integration and maximize the benefits. Here’s how I believe enterprises can effectively navigate the journey of adopting generative AI technology.

    Start with Assessment and Planning

    First and foremost, understanding your business objectives is crucial. It’s not just about jumping on the AI bandwagon because everyone else is doing it. The key is to align AI initiatives with your organization’s strategic goals. Whether you’re looking to enhance operational efficiency, improve customer experiences, or create new revenue streams, identifying the right use cases for generative AI is the foundation of success.

    In my experience, starting with pilot projects that target low-risk, high-impact areas can provide valuable insights and demonstrate the potential of AI within your organization. These initial steps help build confidence and create a roadmap for broader AI integration.

    Building the Foundation: Data, Infrastructure, and Talent

    Generative AI thrives on data, and the quality of your data can make or break your AI initiatives. Ensuring access to high-quality, diverse data and implementing robust data governance frameworks are essential steps. Additionally, decisions around AI infrastructure—whether to go with cloud, on-premises, or hybrid solutions—should be made with scalability and flexibility in mind.

    Equally important is building a skilled team. Recruiting AI specialists and data scientists, while also upskilling existing employees, fosters a culture of continuous learning and innovation. In the fast-paced world of AI, staying ahead requires a commitment to ongoing education and development.

    Implementation: From Pilot to Integration

    Once you’ve laid the groundwork, it’s time to move from pilot projects to full-scale implementation. Integrating generative AI into your existing business processes is where the real transformation happens. Seamless integration with current workflows, coupled with the automation of repetitive tasks, can unlock new efficiencies and streamline operations.

    Customizing AI models to fit specific business needs and continuously refining them based on performance metrics ensures that your AI solutions remain relevant and effective. It’s not a one-time effort but a journey of ongoing optimization.

    Governance, Ethics, and Change Management

    As with any powerful technology, generative AI comes with its own set of ethical considerations and governance challenges. Establishing an AI ethics committee and developing guidelines for responsible AI use are non-negotiables in today’s regulatory environment. Regular monitoring and audits help maintain compliance and mitigate risks.

    Change management plays a critical role in AI adoption. Clear communication about the benefits of AI, coupled with support for employees during the transition, can alleviate concerns and drive smoother implementation. It’s about fostering an AI-friendly culture that embraces innovation while addressing potential fears and misconceptions.

    Continuous Improvement and Long-term Vision

    The journey doesn’t end with the successful implementation of AI. Regularly monitoring AI performance against key metrics and gathering feedback from users are crucial for continuous improvement. Staying updated on the latest AI developments and being open to new opportunities allows your organization to remain competitive and innovative.

    Looking ahead, it’s essential to have a long-term AI roadmap that aligns with your overall business strategy. Anticipating future AI trends, investing in research and development, and maintaining flexibility in your AI strategy will ensure that your organization is well-prepared for the future.

    Collaboration and Partnerships: The Power of Together

    Finally, collaboration is key. Internally, fostering cross-functional teams and encouraging knowledge sharing can accelerate AI adoption. Externally, partnering with AI vendors, industry consortiums, and research institutions can provide access to cutting-edge technologies and best practices. It’s through these collaborations that enterprises can truly unlock the full potential of generative AI.

    Adopting generative AI technology is a strategic journey that requires careful planning, robust infrastructure, ethical considerations, and continuous learning. By following this approach, enterprises can not only harness the power of AI to drive growth and efficiency but also position themselves as leaders in the digital age.

    I’m excited to see how Generative AI will continue to evolve and transform the business landscape. If your organization is considering AI adoption, now is the time to start laying the groundwork. The future is bright, and the opportunities are endless. Let’s embrace the power of AI together.

  • How I passed – Oracle Cloud Infrastructure 2020 Architect Associate (1Z0-1072-20) exam

    The Oracle Cloud Infrastructure (OCI) 2020 Architect Associate exam is one of the more demanding certifications for 2020/21, as OCI is growing rapidly and Gartner has twice named OCI a visionary among public cloud platforms. OCI stands out for its reliable infrastructure, cost, and support. Like other cloud service providers, OCI offers many useful features such as Data Guard, fault domains, and compartments, and its biggest asset is the Autonomous Database.

    Here are the topics I am covering in this post

    1. Certification details (Cost, Registration process and Exam topics)
    2. Examination preparation materials and my reviews
    3. My experience through exam preparation and exam day
    4. Syllabus in detail to prepare for the exam
    1. Certification details:
      • The exam lasts 85 minutes and contains 60 questions. You need to score 65% to pass, i.e., 39 correct answers will earn you the certification. The exam costs USD 150. See the link below for more details.

    https://education.oracle.com/oracle-cloud-infrastructure-2020-architect-associate/pexam_1Z0-1072-20

    • The exam consists of multiple-choice questions only, with four or more options per question. For some questions you may need to select more than one answer. There will also be a question or two that includes a graphic (I explain more in the sections below).
    • To register for the exam, you need an Oracle account; log in through CertView to pay for and schedule the exam. Below is the link for registration.

    https://catalog-education.oracle.com/pls/apex/f?p=1010:26:2076046940597.

    • You can take the exam either at home or at a Pearson VUE test center; the price is the same in both cases. You get the result (pass or fail) immediately. Once you pass, expect to wait 24 to 48 hours for the certification to appear in your Oracle account. You will receive the digital badge through Acclaim.
    • The exam topics cover five major areas. The exam verifies your ability to architect solutions using OCI services in the best possible way (low cost, resilient, reliable, high performance, and secure):
    1. Identity
    2. Networking
    3. Compute
    4. Storage
    5. Databases
    2. Examination preparation materials and my reviews:

    a. The Oracle University learning subscription: below is the link. This subscription includes a workshop, exam preparation material, a practice test, and an exam voucher, but the full package costs more than USD 3,000. I recommend going through only the freely available parts of this subscription, namely the workshop and the practice test.

    https://learn.oracle.com/ols/learning-path/become-oci-architect-associate/35644/75658

    This course will help you understand the various OCI services (compute, network, storage, containers, identity, security, etc.) but will not provide the guidance needed to clear the exam, such as best practices, architectural patterns, design considerations, lessons learned, use cases, and time/situation-based decisions. I cover how to gain that knowledge in the sections below.

    b. The official OCI documentation: below is the link to the official OCI documentation, which covers all OCI services and features in depth.

    https://docs.oracle.com/en-us/iaas/Content/home.htm

    I strongly recommend reading the Compute, IAM, Networking, Security, Storage, Databases sections in order to better prepare for the exam.

    c. The book “Architect Associate All-in-One Exam Guide (Exam 1Z0-1072)”, authored by Roopesh Ramklass and published by McGraw Hill. I also strongly recommend this book, as it provides overall details of each OCI service required for the Architect Associate exam.

    Note: don’t miss the “Tips” sections in this book, which are completely exam-oriented. These tips are a game changer for passing the exam.

    d. The YouTube channel: below is the link to the official OCI YouTube channel.

    https://www.youtube.com/channel/UC60OcDzeEtn194-UPYNJs8A

    There are great videos, especially in the “Networking” and “Core Concepts” sections, which you can refer to while preparing for the exam. Check the other videos as well; they cover broader concepts around cloud/OCI adoption and other industry-related information.

    e. Explore and get your hands dirty with an OCI Free Tier account. Hands-on practice with OCI is the best preparation for the exam. Below is the URL for the OCI Free Tier.

    https://www.oracle.com/cloud/free/

    3. My experience through exam preparation and exam day: I am 6x AWS certified, plus CCSK and CCSP, with 7 years of cloud computing experience across cloud security, adoption, migrations, automation, and DevSecOps, so that overall experience covered at least 30 to 40% of the OCI Architect Associate exam preparation. I worked through all five of the learning resources above over about 35 days (1+ hour per weekday and 2 to 3 hours on weekends). On normal business days it is a little difficult to focus on studies while being a full-time employee and a full-time husband and father 🙂

    I booked the exam at a local Pearson VUE testing center (knowing my kids might disturb me if I took the exam from home 🙂). I arrived at the test center on time, finished the check-in formalities, and was directed to my seat.

    When I saw the very first question, I thought I was going to fail the exam: it was two paragraphs long and took 3+ minutes to read and understand. In fact, I did not even know the right answer. My tip here is: don’t panic if you don’t know the answer. When I don’t know an answer, I guess the best possible option based on my experience with cloud architectures. That said, there are not many lengthy questions; most questions are only two to three lines.

    These are the types of questions I encountered:

    1. Questions with multiple-choice answers where you have to pick one right answer

    2. Questions with multiple-choice answers where you have to pick more than one right answer

    3. Questions with a graphic and multiple-choice answers where you have to pick one right answer

    Note: there were no True/False questions in my exam.

    Time is also essential to consider: I had 5 minutes left to answer 9 questions in my exam. Be careful about spending too much time on questions you are uncertain about. Mark them for review and move on, so you can revisit the marked questions at the end and think them through. I submitted the exam right before it auto-closed the session and saw the result: “Pass”. Hurray, I did it. The core secrets of my success were my previous experience, reading the official OCI documentation, and the All-in-One book (especially its tips).

    4. Syllabus in detail to prepare for the exam: below are the individual topics and the respective areas to prepare well, based on my exam experience.
      • Identity
        1. OCI Administrators, groups and their permissions in details
        2. Compartments
        3. Dynamic groups and auth tokens
        4. API endpoints authentication
        5. 3rd party active directory federations
        6. Permissions to OKE cluster users
        7. Tag based access control
        8. Resource tagging
      • Compute
        1. Which resources are regional or availability-domain based, and which are not
          1. Example: ‘Compute Image’ is a regional resource
        2. Different types of instance shapes and use cases like what shape to use given the situation
        3. OCI OKE and replications
        4. Patch management of operating systems/images
        5. Compute images
        6. Auto Scaling and policies  
        7. Kubectl commands
      • Networking
      1. Load balancers, their situational behaviors and troubleshooting LBaaS issues
      2. VCN default components
      3. Route tables, Subnets and multiple gateways (IG, NAT, DRG, SG etc)
      4. CIDRs
      5. DR architectures
      6. VCN peering
      7. Primary/Secondary VNICs (Virtual Network Interface Cards)
    • Storage
      1. Boot volumes and disk performances
      2. Object versioning, namespaces and retention rules
      3. Object storage sizing limits and multi part uploads
      4. Block volume resizing
      5. Object storage permissions
      6. File storage, NFS exports
      7. Object storage encryption
      8. Volume backups and restores
    • Databases
      1. DB migration methods to ATP (Autonomous Transaction Processing)
      2. Data Guard and its use cases
      3. DB/activity Log monitoring
      4. RAC Database systems
      5. Database host connectivity issues 
      6. Vault, encryption keys
      7. High Level details about OCI Autonomous data warehouse 

    Now you have everything you need to know, and it’s time to plan (based on your previous experience with cloud/OCI) what to study, how to study, and how long to study. Let me know if you have any additional questions. All the best!

  • Planning for Cloud Migration? Consider the 6 Rs Strategy

    Is your migration to the cloud turning into a daunting process? Then it’s time to do it the right way, so it supports your business case and delivers the expected value, because migrating infrastructure, data, and applications affects both the people and the processes that rely upon them.

    Planning is key to a successful cloud migration: you need to create a migration strategy by identifying the potential migration options and understanding the interdependencies between applications, data, and infrastructure.

    Effective cloud migration planning follows the six “R” best practices that enterprises are successfully applying. Make sure to discuss these points with application and data owners to prioritize which assets to migrate under which strategy.


    Rehost (lift and shift, forklift) – simply “clone” your servers and move them to the cloud provider’s infrastructure. The cloud provider then manages the underlying hardware and hypervisor infrastructure, and you continue to manage the same operating system and installed applications.

    Repurchase – many commercial applications are now available as Software as a Service (SaaS), for example Salesforce. So it may be easier for you to move away from managing applications and infrastructure and adopt a SaaS deployment model.

    Replatform – move an application that currently relies on legacy infrastructure onto new cloud-based infrastructure. You change only the underlying services while maintaining the core application code, so the architecture of your application does not change.

    Refactor – change your application code to leverage cloud-native services; this is often called “application modernization”. For example, you may want to move away from server-based applications to leverage a cloud provider’s serverless functionality.

    Retain – some on-prem applications may be newly developed with fresh investment, may not satisfy the business case for moving to the cloud, or may have dependencies and software licenses that do not yet support public cloud platforms. You retain those applications on-premises.

    Retire – classify which applications and infrastructure actually need cloud migration, and get rid of aged assets that no longer serve a business need. Simply retire those apps and that infrastructure instead of spending time and money moving them to the cloud.

  • What is “Minimum Viable Cloud (MVC)” Environment and Approach?


    I always define the benefits of cloud computing in three ways:

    1. Business strategy (shared responsibilities between consumer and CSP, fearless updates and upgrades, global collaboration, technological transformation, and staying ahead of the competition)

    2. Technological flexibility (infrastructure scalability, storage options, deployment choices [IaaS, PaaS, SaaS], features, and tools)

    3. Operational efficiency (global accessibility, infra/app/data security, agility that speeds time to market, CapEx to OpEx, on-demand pay-as-you-go)

    We have known these cloud computing benefits for many years now, but in my experience as a cloud solution consultant who has handled many customers across the globe, below is the consolidated list of questions I hear from C-level executives. They are all about cloud adoption and migration:

    1. How is the public cloud better than on-premises?

    2. How complex is the migration of our corporate data center to the public cloud?

    3. Are our current applications deployable to the cloud, and how can we assess them?

    4. What is the cost of the infra/app migration?

    5. What will be the TCO and ROI benefits if we move to the cloud?

    6. Is the public cloud more secure than on-premises?

    7. How are our competitors using the public cloud?

    That is why I was engaged. Here is what I tell these valued clients:

    We migrate to the cloud through a few essential steps that evaluate the situation, provide the detail needed to address the questions above, and lay out a roadmap for the future public cloud estate.

    I always say: Initiate, Assess, Approve, Build & Migrate.

    Initiate: Enterprises initiate the plan of moving to the public cloud per their desired vision by bringing all the technical and business stakeholders onto one platform and kick-starting the planned steps (for example: engage a strategic partner, build a CoE team, acquire the knowledge, etc.).

    Assess: Assess the current state of on-premises infrastructure, applications, data, security, and networking against the public cloud platforms.

    Approve: Getting plans approved is standard for any organization. The plan must adhere to organizational policies, regulations, and compliance requirements, and the business needs to approve the cloud adoption plan.

    Build: Build a demo/POC environment in the public cloud that can support the infrastructure, application, and data hosting.

    Migrate: Move the simple workloads to the cloud first and validate the benefits.

    Later, leverage the tools and services of the chosen CSP to build a secure, reliable, and efficient cloud platform that can support the 6 Rs (rehosting, replatforming, repurchasing, refactoring, retiring, and retaining).

    So now I have advised my customer on the benefits of cloud computing and the strategic steps to take to migrate to the cloud. But what approach or method should they follow?

    That is where the “Minimum Viable Cloud” approach comes in.

    There are many use cases for an organization to move its workloads to a public cloud platform. Whatever the circumstances or use case, building a secure public cloud platform that satisfies the requirements to launch at least one application and engages all the key stakeholders of the organization, so they can explore and experiment with the platform for their needs, is called the minimum viable cloud environment, and the approach is the MVC approach.

    1. This cloud platform should demonstrate the viability of cloud services.

    2. This cloud platform should be secured in alignment with Cloud Security Alliance guidance, ensuring infrastructure, application, data, and network security.

    3. This cloud platform may include an agile mindset, automation, DevSecOps, CI/CD pipelines for infrastructure and applications, virtual/physical connectivity with the on-premises data center, and extended corporate user access.

    During the MVC approach, the customer should be ready with the details needed to satisfy the MVC characteristics stated above.

    Below are a few use cases for the MVC approach:

    1. Deploying a web page with high availability and geographical accessibility

    2. Re-hosting an enterprise application with logging & monitoring in place

    3. Migrating a Sybase DB to Cloud MSSQL

    4. Re-engineering a database to host in public cloud with high security and low cost

    5. Building a multi environment platform with CI/CD for faster application releases.

    The MVC approach is a secure and comprehensive way to adopt the cloud!

  • How to pass CCSKv4 (Certificate of Cloud Security Knowledge) in your first attempt

    The “Certificate of Cloud Security Knowledge (CCSK)” is the first professional certification in the cloud security industry; it was released in 2011 and quickly gained momentum. If you look at the top vendor-neutral certifications for cloud security, CCSK and CCSP (Certified Cloud Security Professional) stand ahead of the rest.

    In 2017 the CCSK changed from version 3 to version 4, and the current examination tests knowledge of version 4 only. CCSK is managed by the Cloud Security Alliance, while CCSP is managed by (ISC)². CCSK helps individual cloud security engineers and architects, employers, cloud service providers, and consulting firms develop security programs aligned with globally accepted standards, and further build and maintain a secure cloud business.

    How to prepare for the CCSK exam:

    There are multiple ways to prepare for the exam, but the approach below should ease the preparation and help you clear the certification on your first attempt.

    Step 1: Read and understand the Security Guidance provided by the Cloud Security Alliance. It can be downloaded for free from the CSA website, or you can find it in the preparation kit I have shared in this post. The guidance consists of 152 pages covering 14 security domains. Below is how the exam questions were distributed across these domains for me; the split may vary slightly from case to case.

    Domain 1: Cloud Computing Concepts and Architectures (6 questions)

    Domain 2: Governance & Enterprise Risk Management (2 questions)

    Domain 3: Legal Issues, Contracts & Electronic Discovery (3 questions)

    Domain 4: Compliance & Audit Management (3 questions)

    Domain 5: Information Governance (2 questions)

    Domain 6: Management Plane & Business Continuity (4 questions)

    Domain 7: Infrastructure Security (6 questions)

    Domain 8: Virtualization & Containers (5 questions)

    Domain 9: Incident Response (4 questions)

    Domain 10: Application Security (6 questions)

    Domain 11: Data Security & Encryption (6 questions)

    Domain 12: Identity, Entitlement & Access Management (3 questions)

    Domain 13: Security as a Service (SecaaS) (2 questions)

    Domain 14: Related Technologies (1 question)

    I strongly recommend reading this document thoroughly at least twice. At least 80% of the questions are direct statements from this book. Below is an example:

    Question: Immutable workloads make it faster to roll out updated versions because applications must be designed to handle individual nodes going down.

    1. True
    2. False

    The right answer is “True”.

    You can see the same statement written on page 86 of the Security Guidance book.


    Step 2: Read and understand the ENISA (European Network and Information Security Agency) document on benefits, risks, and recommendations for information security (mainly the ‘Top Security Risks’ section).

    I received 3 questions from this document in the CCSK exam. I have shared it in the preparation kit in this post as well.

    Step 3: Walk through the Cloud Controls Matrix (CCM v3.0.1).

    The Cloud Controls Matrix is a baseline set of security controls created by the Cloud Security Alliance to help enterprises assess the risk associated with a cloud service provider.

    I received 4 questions from this document and have shared the CCM in this post as part of the preparation kit.

    Preparation kit: below is the link to the preparation kit.

    https://github.com/spadigala/CCSKv4

    Examination Pattern, Tips and Cautions

    Exam pattern: you can register for the CCSK exam by signing up through the link below.

    https://ccsk.cloudsecurityalliance.org/en/login

    • Once you sign up, you need to buy the exam with a credit card or PayPal; it costs USD 395. You get 2 attempts to acquire the certification, and if you pass the exam on your first attempt, the second attempt is forfeited.
    • The CCSK exam is very flexible: you can take it from home.
    • The CCSK exam consists of 60 questions (multiple choice and True/False) with 90 minutes to complete. You can ‘Mark for review’ any question you want to revisit, but if you mark questions for review and are unable to get back to them within the 90 minutes, those questions will not count toward your result.

    Exam Tips: –

    I suggest taking the exam on a laptop or desktop with two monitors. On one monitor you attempt the exam, and on the other you open the guidance documents (ENISA, CCM, etc.) so you can search them for the terms that appear in a question.

    Cautions: –

    Time is crucial in this exam. Many people think that, because the exam can be taken at home, they can keep the required material open to read and search; in many cases this wastes time, as you may search for and answer 50% of the questions but fail to attempt the remaining 50% when time runs out.

    Available trainings from the market

    • The CSA (Cloud Security Alliance) provides information on classroom and virtual trainings across the globe; you can search for them at the link below. Keep in mind that these trainings are very expensive: the minimum cost I could find was USD 1,945 for a 3-day training program.

    https://intrinsecsecurity.com/training/courses/ccsk/

    • There is an on-demand online training (recorded video) provided by https://intrinsecsecurity.com/, but it still costs USD 995. One good thing is that it includes an exam voucher to redeem, which saves you USD 395.
    • You can buy and practice questions from Whizlabs (https://www.whizlabs.com). Note that these questions are for self-assessment only and are not exam dumps; do not expect more than 5% of the real exam questions to come from them. These practice questions can still help you prepare by building confidence in the required subject matter.

    In my view, if you follow the 3-step preparation above for about a month, you do not need any of these trainings, since they mostly help you visualize the same content you would otherwise get by reading the guidance document.

    Once you finish the exam, you receive the result immediately (pass or fail) along with a detailed report showing, for each domain, the questions received and answered correctly. You can download the PDF version of the certificate immediately after you pass the exam.

    All the best for your exam, and let me know how this plan helped you.

  • What is DevSecOps

    In late 2016, when I was stuck building a CI/CD pipeline with security controls for a PaaS solution at one of the largest UK banks, I kept thinking there should be a philosophical approach (in every organization) that makes everyone feel responsible for the security of what they are building and innovating. In late 2017, while engaged with a large insurance client, I was asked to identify and build security capabilities as an interim solution for an app/infra CI/CD pipeline, and again I got stuck for lack of blueprints and playbooks for integrating security into each service in the pipeline; for example, a Jenkins playbook containing a detailed operating reference with security built in (for the CI solution).

    I seriously thought of it as a philosophical and cultural shift an organization makes to enhance its DevOps practice with security. In early 2018 I learned the term DevSecOps, which finally put a name to my earlier experiences and thoughts. Yes, the name defines it: practice your DevOps culture with an attitude of Security as a Service (SecaaS) and Security as Code (SecaaC). Together, SecaaS and SecaaC define your organization’s operational and engineering efficiency. That’s DevSecOps.

    DevOps is not only about developing and operating the business and its teams. As I always say, adopt a DevOps culture and an agile mindset, but if you want a better outcome from those two attitudes, security must also play an integrated role across the full life cycle of your infrastructure and applications. Traditionally the developers, the operations team, and the security team are separate. Now it’s more like a one-person army: the “DevSecOps engineer”.

    As everyone knows, DevOps and agile get your code developed and released faster and more frequently, but a lack of security standards (or reliance on legacy standards) can ruin the business goals and cause the DevOps approach to fail.

    I suggest that developers, ops engineers, and security engineers maintain short and frequent development cycles, integrate security measures (while trying to minimize operational disturbance in doing so), adopt innovative technologies like containers and microservices, and program-manage and scale the DevSecOps approach across commonly isolated teams. This is a tall order for any organization.

    Next time, I will come back here to post more glimpses of pipeline (infra/app) components with security integrated, and will publish some playbooks too.

  • AWS IAM – User/Policy Management (Industry Successful Practice with Job functions)

     Temporary Security Credentials (The Successful Market Practice)

    Temporary security credentials are time-bound access credentials for AWS users. They can be configured to last anywhere from a few minutes to several hours (a minimum of 1 and maximum of 2 hours as a best practice). After the credentials expire, AWS no longer recognizes them or allows any kind of access from API requests made with them.

    Temporary security credentials are not stored with the user but are generated dynamically and provided to the user when requested. When (or even before) the temporary security credentials expire, the user can request new credentials, as long as the user requesting them still has permissions to do so.

    Advantages: –

    Temporary credentials are useful in scenarios that involve identity federation, delegation, cross-account access, and IAM roles. Also

    • You do not have to distribute or embed long-term AWS security credentials with an application.
    • You can provide access to your AWS resources to users without having to define an AWS identity for them. Temporary credentials are the basis for roles and identity federation.
    • The temporary security credentials have a limited lifetime, so you do not have to rotate them or explicitly revoke them when they’re no longer needed. After temporary security credentials expire, they cannot be reused. You can specify how long the credentials are valid, up to a maximum limit.
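    As a hedged illustration (not part of the original setup), here is a minimal boto3 sketch of requesting temporary credentials by assuming an IAM role through AWS STS. The role ARN and session name are placeholders.

    import boto3

    # Ask STS for short-lived credentials by assuming a role.
    # The role ARN and session name below are placeholders for illustration.
    sts = boto3.client("sts")
    response = sts.assume_role(
        RoleArn="arn:aws:iam::123456789012:role/ExampleAppRole",  # hypothetical role
        RoleSessionName="example-session",
        DurationSeconds=3600,  # credentials expire after one hour
    )
    creds = response["Credentials"]

    # Use the temporary credentials for subsequent API calls; no long-term keys are involved.
    s3 = boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    print("Temporary credentials expire at:", creds["Expiration"])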

    How to implement?

    Integrate with corporate directory

    IAM can be used to grant your employees and applications federated access to the AWS Management Console and AWS service APIs, using your existing identity systems such as Microsoft Active Directory. You can use any identity management solution that supports SAML 2.0, or feel free to use one of our federation samples (AWS Console SSO or API federation).

    Enterprise identity federation – You can authenticate users in your organization’s network, and then provide those users access to AWS without creating new AWS identities for them and requiring them to sign in with a separate user name and password. This is known as the single sign-on (SSO) approach to temporary access. AWS STS supports open standards like Security Assertion Markup Language (SAML) 2.0, with which you can use Microsoft AD FS to leverage your Microsoft Active Directory. You can also use SAML 2.0 to manage your own solution for federating user identities. For more information, see About SAML 2.0-based Federation.

    • Custom federation broker – You can use your organization’s authentication system to grant access to AWS resources.
    • Federation using SAML 2.0 – You can use your organization’s authentication system and SAML to grant access to AWS resources.

    Enable Federated API Access to your AWS Resources for up to 12 hours Using IAM Roles

    Applications and federated users can complete longer running workloads in a single session by increasing the maximum session duration up to 12 hours for an IAM role. Users and applications still retrieve temporary credentials by assuming roles using AWS Security Token Service (AWS STS), but these credentials can now be valid for up to 12 hours when using the AWS SDK or CLI. This change allows your users and applications to perform longer running workloads, such as a batch upload to S3 or a CloudFormation template, using a single session. You can extend the maximum session duration using the IAM console or CLI. Once you increase the maximum session duration, users and applications assuming the IAM role can request temporary credentials that expire when the IAM role session expires.

    Below is how to configure the maximum session duration for an existing IAM role to 4 hours (the maximum allowed is 12 hours) using the IAM console. I use 4 hours because AWS recommends configuring a role’s session duration to the shortest duration your federated users need to access your AWS resources. I then show how existing federated users can use the AWS SDK or CLI to request temporary security credentials that remain valid until the role session expires.

    Prerequisites

    Here we use a federation example. If there is an existing identity provider, you might have enabled federation by using SAML to allow your users to access your AWS resources. This post assumes you have created an IAM role for a third-party identity provider (Onprem IAM) that defines the permissions for your federated users. You could also have configured SAML-based federation for API access to AWS. For my example, I’ve chosen to configure SAML-based federation using Microsoft Active Directory Federation Service (ADFS) as the identity provider (IdP).

    Configure the maximum session duration for an existing IAM role to 4 hours

    Assume we have an existing IAM role called ADFS-Production that allows your federated users to upload objects to an S3 bucket in your AWS account. You want to extend the maximum session duration for this role to 4 hours. By default, IAM roles in your AWS accounts have a maximum session duration of one hour. To extend a role’s maximum session duration to 4 hours, follow the steps below:

    1. Sign in to the IAM console.
    2. In the left navigation pane, select Roles and then select the role for which you want to increase the maximum session duration. For this example, I select ADFS-Production and verify the maximum session duration for this role. This value is set to 1 hour (3,600 seconds) by default.
    3. Select Edit, and then define the maximum session duration.

    4. Select one of the predefined durations or provide a custom duration. For this example, I set the maximum session duration to be 4 hours.

    5. Select Save changes.

    Alternatively, you can use the latest AWS CLI and call Update-Role to set the maximum session duration for the role ADFS-Production. Here’s an example to set the maximum session duration to 14,400 seconds (4 hours).

    $ aws iam update-role --role-name ADFS-Production --max-session-duration 14400
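    For completeness, a roughly equivalent call with boto3 (my sketch, not from the original walkthrough) looks like this:

    import boto3

    iam = boto3.client("iam")

    # Raise the maximum session duration for the ADFS-Production role to 4 hours (14,400 seconds).
    iam.update_role(
        RoleName="ADFS-Production",
        MaxSessionDuration=14400,
    )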

    Now that you’ve successfully extended the maximum session for your IAM role, ADFS-Production, your federated users can use AWS STS to retrieve temporary credentials that are valid for 4 hours to access your S3 buckets.

    Access AWS resources with temporary security credentials using AWS CLI/SDK

    To enable federated SDK and CLI access for your users who use temporary security credentials, you might have implemented the solution described in the blog post on How to Implement Federated API and CLI Access Using SAML 2.0 and AD FS. That blog post demonstrates how to use the AWS Python SDK and some additional client-side integration code provided in the post to implement federated SDK and CLI access for your users. To enable your users to request longer temporary security credentials, you can make the following changes suggested in this blog to the solution provided in that post.

    When calling the AssumeRoleWithSAML API to request AWS temporary security credentials, you need to include the DurationSeconds parameter. The value of this parameter is the duration the user requests and, therefore, the duration their temporary security credentials are valid. In this example, I am using boto to request the maximum configured duration of 14,400 seconds (4 hours), using code from the How to Implement Federated API and CLI Access Using SAML 2.0 and AD FS post that I have updated:

    # Use the SAML assertion to get temporary AWS credentials via AssumeRoleWithSAML (boto 2)
    import boto.sts

    conn = boto.sts.connect_to_region(region)
    # Request a 4-hour (14,400 second) session by passing the duration explicitly as a keyword
    token = conn.assume_role_with_saml(role_arn, principal_arn, assertion, duration_seconds=14400)

    By adding a value for the DurationSeconds parameter in the AssumeRoleWithSAML call, your federated user can retrieve temporary security credentials that are valid for up to 14,400 seconds (4 hours). If you don’t provide this value, the default session duration is 1 hour.
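    Since boto 2 is a legacy SDK, a roughly equivalent call with the current boto3 SDK (my sketch, with placeholder values) looks like this:

    import boto3

    sts = boto3.client("sts")

    # Placeholder inputs: in the federation flow above, these come from parsing the IdP response.
    role_arn = "arn:aws:iam::123456789012:role/ADFS-Production"        # role to assume
    principal_arn = "arn:aws:iam::123456789012:saml-provider/ADFS"     # SAML provider registered in IAM
    assertion = "<base64-encoded SAML assertion returned by the IdP>"

    response = sts.assume_role_with_saml(
        RoleArn=role_arn,
        PrincipalArn=principal_arn,
        SAMLAssertion=assertion,
        DurationSeconds=14400,  # request the full 4-hour session
    )
    credentials = response["Credentials"]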

    Governance over AWS IAM

    Monitor activity in your AWS account

    • The IAM best practice “monitor activity in your AWS account” encourages you to monitor user activity in your AWS account by using services such as AWS CloudTrail and AWS Config. In addition to monitoring usage in your AWS account, you should be aware of inactive users so that you can remove them from your account. By retaining only necessary users, you help maintain the security of your AWS account.
    • Use different access keys for different applications, so that you can isolate permissions and revoke the access keys for individual applications if an access key is exposed. Having separate access keys for different applications also generates distinct entries in AWS CloudTrail log files, which makes it easier to determine which application performed specific actions.
    • Rotate access keys periodically. Change access keys on a regular basis. Below are the steps to rotate access keys without interrupting your applications (a boto3 sketch of the full rotation follows the list).

              Rotate access keys without interrupting your applications (API, CLI, PowerShell)

    1. While the first access key is still active, create a second access key, which is active by default. At this point, the user has two active access keys.
      • AWS CLI: aws iam create-access-key
      • Tools for Windows PowerShell: New-IAMAccessKey
      • AWS API: CreateAccessKey
    2. Update all applications and tools to use the new access key.
    3. Determine whether the first access key is still in use:
      • AWS CLI: aws iam get-access-key-last-used
      • Tools for Windows PowerShell: Get-IAMAccessKeyLastUsed
      • AWS API: GetAccessKeyLastUsed

    One approach is to wait several days and then check the old access key for any use before proceeding.

    4. Even if Step 3 indicates no use of the old key, we recommend that you do not immediately delete the first access key. Instead, change the state of the first access key to Inactive.
      • AWS CLI: aws iam update-access-key
      • Tools for Windows PowerShell: Update-IAMAccessKey
      • AWS API: UpdateAccessKey
    5. Use only the new access key to confirm that your applications are working. Any applications and tools that still use the original access key will stop working at this point because they no longer have access to AWS resources. If you find such an application or tool, you can switch its state back to Active to re-enable the first access key. Then return to Step 2 and update this application to use the new key.
    6. After you wait some period of time to ensure that all applications and tools have been updated, you can delete the first access key.
      • AWS CLI: aws iam delete-access-key
      • Tools for Windows PowerShell: Remove-IAMAccessKey
      • AWS API: DeleteAccessKey
    • Remove unused access keys. If a user leaves your organization, remove the corresponding IAM user so that the user’s access to your resources is removed. To find out when an access key was last used, use the GetAccessKeyLastUsed API (AWS CLI command: aws iam get-access-key-last-used).
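    As a hedged companion to the steps above, here is a minimal boto3 sketch of the same rotation flow for a single IAM user. The user name and old key ID are placeholders, and the waiting/verification periods are collapsed into comments.

    import boto3

    iam = boto3.client("iam")
    user = "example-app-user"          # placeholder IAM user name
    old_key_id = "AKIAOLDKEYEXAMPLE"   # placeholder: the access key being retired

    # Step 1: create a second access key while the first one is still active.
    new_key = iam.create_access_key(UserName=user)["AccessKey"]
    print("New access key id:", new_key["AccessKeyId"])

    # Steps 2-3: update applications to use the new key, then check whether the old key is idle.
    last_used = iam.get_access_key_last_used(AccessKeyId=old_key_id)
    print("Old key last used:", last_used["AccessKeyLastUsed"].get("LastUsedDate"))

    # Steps 4-5: deactivate (do not delete) the old key first, so it can be re-enabled if something breaks.
    iam.update_access_key(UserName=user, AccessKeyId=old_key_id, Status="Inactive")

    # Step 6: after a safe waiting period with everything still working, delete the old key for good.
    iam.delete_access_key(UserName=user, AccessKeyId=old_key_id)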

    Required User Roles (Industry Practice – Key Roles)

    1. AWS Manager – Billing & Projects

    Administrator

    AWS managed policy name: AdministratorAccess

    Use case: This user has full access and can delegate permissions to every service and resource in AWS.

    Policy description: This policy grants all actions for all AWS services and for all resources in the account.

    Billing

    AWS managed policy name: Billing

    Use case: This user needs to view billing information, set up payments, and authorize payments. The user can monitor the costs accumulated for each AWS service.

    Policy description: This policy grants permissions for managing billing and costs. The permissions include viewing and modifying both budgets and payment methods.
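    To make these job-function policies concrete, here is a small boto3 sketch (my illustration; the group name is a placeholder) that attaches the AWS managed Billing job-function policy to a group so every member inherits it. Verify the exact policy ARN in the IAM console before using it.

    import boto3

    iam = boto3.client("iam")

    # Placeholder group for the billing job function.
    group_name = "billing-managers"
    iam.create_group(GroupName=group_name)

    # Attach the AWS managed job-function policy instead of writing a custom one.
    iam.attach_group_policy(
        GroupName=group_name,
        PolicyArn="arn:aws:iam::aws:policy/job-function/Billing",
    )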

     

    2. AWS Auditor – Security & Compliance

    Security Auditor

    AWS managed policy name: SecurityAudit

    Use case: This user monitors accounts for compliance with security requirements. This user can access logs and events to investigate potential security breaches or potential malicious activity.

    Policy description: This policy grants permissions to view configuration data for many AWS services and to review their logs.

    3. AWS Developer – APIs & Tools Power User

    Developer Power User

    AWS managed policy name: PowerUserAccess

    Use case: This user performs application development tasks and can create and configure resources and services that support AWS-aware application development.

    Policy description: This policy grants view, read, and write permissions for a variety of AWS services intended for application development, including Amazon API Gateway, Amazon AppStream, Amazon CloudSearch, AWS CodeCommit, AWS CodeDeploy, AWS CodePipeline, AWS Device Farm, Amazon DynamoDB, Amazon Elastic Compute Cloud (EC2), Amazon Elastic Container Service (ECS), AWS Lambda, Amazon RDS, Amazon Route 53, Amazon Simple Storage Service (S3), Amazon Simple Email Service (SES), Amazon Simple Queue Service (SQS), and Amazon Simple Workflow Service (SWF).

     

    4. AWS Engineer – Monitoring & Operations Support

    System Administrator

    AWS managed policy name: SystemAdministrator

    Use case: This user sets up and maintains resources for development operations.

    Policy description: This policy grants permissions to create and maintain resources across a large variety of AWS services, including AWS CloudTrail, Amazon CloudWatch, AWS CodeCommit, AWS CodeDeploy, AWS Config, AWS Directory Service, Amazon EC2, AWS Identity and Access Management, AWS Key Management Service, AWS Lambda, Amazon RDS, Route 53, Amazon S3, Amazon SES, Amazon SQS, AWS Trusted Advisor, and Amazon VPC. This job function requires the ability to pass roles to AWS services. The policy grants iam:GetRole and iam:PassRole for only those roles named in the following table.

    Optional IAM service roles for the System Administrator job function

    (In role names, * is a wildcard.)

    • Use case: Allow apps running in EC2 instances in an Amazon ECS cluster to access Amazon ECS
      Role name: ecr-sysadmin-*
      Service role type to select: Amazon EC2 Role for EC2 Container Service
      AWS managed policy to select: AmazonEC2ContainerServiceforEC2Role
    • Use case: Allow a user to monitor databases
      Role name: rds-monitoring-role
      Service role type to select: Amazon RDS Role for Enhanced Monitoring
      AWS managed policy to select: AmazonRDSEnhancedMonitoringRole
    • Use case: Allow apps running in EC2 instances to access AWS resources
      Role name: ec2-sysadmin-*
      Service role type to select: Amazon EC2
      AWS managed policy to select: a sample policy for a role that grants access to an S3 bucket, as shown in the Amazon EC2 User Guide for Linux Instances; customize as needed
    • Use case: Allow Lambda to read DynamoDB streams and write to CloudWatch Logs
      Role name: lambda-sysadmin-*
      Service role type to select: AWS Lambda
      AWS managed policy to select: AWSLambdaDynamoDBExecutionRole

     

    5. AWS Administrator – Databases & Storage

    Database Administrator

    AWS managed policy name: DatabaseAdministrator

    Use case: This user sets up, configures, and maintains databases in the AWS Cloud.

    Policy description: This policy grants permissions to create, configure, and maintain databases. It includes access to all AWS database services, such as Amazon DynamoDB, Amazon ElastiCache, Amazon Relational Database Service (RDS), Amazon Redshift, and other supporting services. This policy supports the ability to pass roles to AWS services: it grants iam:GetRole and iam:PassRole for only those roles named in the following table.

    Optional IAM service roles for the Database Administrator job function

    (In role names, * is a wildcard.)

    • Use case: Allow the user to monitor RDS databases
      Role name: rds-monitoring-role
      Service role type to select: Amazon RDS Role for Enhanced Monitoring
      AWS managed policy to select: AmazonRDSEnhancedMonitoringRole
    • Use case: Allow AWS Lambda to monitor your database and access external databases
      Role name: rdbms-lambda-access
      Service role type to select: Amazon EC2
      AWS managed policy to select: AWSLambdaFullAccess
    • Use case: Allow Lambda to upload files to Amazon S3 and to Amazon Redshift clusters with DynamoDB
      Role name: lambda_exec_role
      Service role type to select: AWS Lambda
      AWS managed policy to select: create a new managed policy as defined in the AWS Big Data Blog
    • Use case: Allow Lambda functions to act as triggers for your DynamoDB tables
      Role name: lambda-dynamodb-*
      Service role type to select: AWS Lambda
      AWS managed policy to select: AWSLambdaDynamoDBExecutionRole
    • Use case: Allow Lambda functions to access Amazon RDS in a VPC
      Role name: lambda-vpc-execution-role
      Service role type to select: create a role with a trust policy as defined in the AWS Lambda Developer Guide
      AWS managed policy to select: AWSLambdaVPCAccessExecutionRole
    • Use case: Allow AWS Data Pipeline to access your AWS resources
      Role name: DataPipelineDefaultRole
      Service role type to select: create a role with a trust policy as defined in the AWS Data Pipeline Developer Guide
      AWS managed policy to select: AWSDataPipelineRole
    • Use case: Allow your applications running on Amazon EC2 instances to access your AWS resources
      Role name: DataPipelineDefaultResourceRole
      Service role type to select: create a role with a trust policy as defined in the AWS Data Pipeline Developer Guide
      AWS managed policy to select: AmazonEC2RoleforDataPipelineRole