Identity and access management is foundational to modern cybersecurity. Learn how to safeguard systems against unauthorized access and more.
Developer Experience: Demand to support engineering teams has risen, and there is a shift from traditional DevOps to workflow improvements.
AI-Powered Professor Rating Assistant With RAG and Pinecone
How to Scale Elasticsearch to Solve Your Scalability Issues
Developer Experience
With tech stacks becoming increasingly diverse and AI and automation continuing to take over everyday tasks and manual workflows, the tech industry at large is experiencing a heightened demand to support engineering teams. As a result, the developer experience is changing faster than organizations can consciously maintain. We can no longer rely on DevOps practices or tooling alone — there is even greater power recognized in improving workflows, investing in infrastructure, and advocating for developers' needs. This nuanced approach brings developer experience to the forefront, where devs can begin to regain control over their software systems, teams, and processes.

We are happy to introduce DZone's first-ever Developer Experience Trend Report, which assesses where the developer experience stands today, including team productivity, process satisfaction, infrastructure, and platform engineering. Taking all perspectives, technologies, and methodologies into account, we share our research and industry experts' perspectives on what it means to effectively advocate for developers while simultaneously balancing quality and efficiency. Come along with us as we explore this exciting chapter in developer culture.
Apache Cassandra Essentials
Identity and Access Management
The advancement of artificial intelligence (AI) has improved many fields, including cybersecurity. However, this technological progress is a double-edged sword. While AI brings many benefits, it also enables cybercriminals to launch increasingly sophisticated and destructive attacks. One of the most troubling developments is the use of AI in ransomware attacks. These AI-driven attacks are automated at multiple levels and use subtle techniques that make them more efficient, more complex, and more damaging. Above all, the threat landscape is evolving quickly, creating new challenges for individuals and organizations. This blog explores how AI is reshaping the cybersecurity implications of ransomware attacks and the strategies needed to defend against this growing threat.

The Rise of AI in Cyber Attacks

Artificial intelligence and machine learning have transformed many fields, and cyberattacks are no exception. Traditionally, cybercriminals relied on manual techniques to break into systems, a tedious process requiring considerable skill and effort. With AI, attackers can automate and refine their attack methods, making them more sophisticated and harder to detect. AI algorithms can process large volumes of data quickly and identify patterns and vulnerabilities that humans could not find in a reasonable amount of time. This allows cybercriminals to launch more precise, targeted attacks, increasing their likelihood of success. AI is used to automate many stages of a cyberattack, including automated reconnaissance, where attackers gather information about their targets. AI algorithms can mine public data from social media and other online sources to build profiles of potential victims. This information feeds highly customized phishing messages that are far more convincing than conventional spam. AI can also adjust hacking attempts in real time based on the target's responses or behavior, improving the odds of a breach.

AI also enhances malware itself. AI-enabled malware can sense its environment, adapt to evade conventional security measures, and delay its execution to avoid detection. These developments make AI-driven cyberattacks more sophisticated and harder to defend against, presenting a serious challenge for cybersecurity professionals. As each generation of AI transforms the methods available to cybercriminals, new classes of cyber threats emerge, demanding equally modern defenses.

AI-Enhanced Phishing

Phishing, a fraudulent attempt to obtain sensitive information such as usernames, passwords, and credit card details by posing as a trustworthy entity in electronic communication, has long been a cornerstone of cyberattacks. However, the integration of AI has elevated this threat to new heights. Traditional phishing involves sending generic messages in bulk, hoping to trick some recipients into revealing sensitive information.
As people and organizations become more aware and cautious, these generic campaigns lose their potency. AI changes the equation: algorithms can analyze public social media profiles and large amounts of information from other online sources to build detailed personal profiles of targets. Phishing emails can then mimic the writing style of friends or financial institutions, lending them extra credibility. AI can also dynamically modify the content of phishing emails based on a recipient's responses or behavior, increasing the likelihood of success.

Automating Attack Processes With AI

AI's ability to automate attack processes has enormous implications for individuals and organizations. Cybercriminals can use AI to make their attacks more effective and harder to counter. For instance, AI can automatically scan networks and systems, identify vulnerabilities, and gather information far faster than any human. This allows attackers to spot weaknesses and develop strategies to exploit them. AI-based tools can automate malware deployment, choosing the timing and pace of delivery. Once inside a system, AI helps malware evade security features and detection methods by tailoring its behavior to its environment. Furthermore, AI automates lateral movement within the network, identifying and exploiting additional weaknesses to maximize damage. This degree of automation means that damaging attacks will increase in volume and pose a more severe risk to individuals and organizations.

The Changing Threat Landscape

Incorporating AI into cyberattacks is fundamentally changing the threat landscape, creating challenges for individuals and businesses alike. Historically, cyber threats were largely manual, relying on the ingenuity and adaptability of the attacker. The nature of these threats has evolved as AI has made them more automated, adaptive, and scalable. AI-based attacks can analyze vast amounts of data to identify vulnerabilities and launch highly targeted phishing campaigns that spread the latest malware with minimal human intervention. The speed and scale of AI-powered attacks mean that threats can emerge more suddenly than ever before. For instance, AI can automate the reconnaissance and surveillance stages and map out targets quickly and precisely. This rapid vulnerability identification allows attackers to exploit weaknesses before they are patched, giving organizations less time to respond. Additionally, AI can produce malware that constantly mutates to evade detection by traditional security frameworks, making it more difficult to defend against.

The scale of risk that AI introduces is a significant concern. Automated tools can launch hundreds of attacks simultaneously against numerous companies and individuals. This level of automation escalates cybercriminal activity and can overwhelm existing security controls, which were generally not designed to handle such volume. AI can also leverage personal data to lend credibility to fraudulent messages and tailor social engineering approaches to specific individuals. The gap between attackers and defenders may widen as the AI era emerges.
Cybersecurity professionals must stay ahead of both the technology and the skills needed to counter it. This includes using AI defensively for predictive analysis, comprehensive anomaly detection, and risk assessment and mitigation. The constantly evolving threat landscape demands a proactive and adaptable approach to cybersecurity, and adopting AI for defense is crucial to countering AI-based attacks.

Defense Strategies Against AI-Powered Ransomware

Defending against AI-powered ransomware requires a multi-layered approach that combines advanced technology with robust processes. Traditional security controls alone are no longer adequate against AI-driven threats that can adapt and evolve. Organizations should combine preventive intelligence with responsive measures to combat these attacks. AI and machine learning should be built into cybersecurity defenses. The technology can continuously analyze enormous volumes of logs to spot anomalies and potential threats that human analysts would miss. For instance, an AI-powered security system can recognize unusual behavior, rapid access patterns, or data access that indicates an attack in progress. Machine learning models can also improve over time, becoming more adept at recognizing the hallmarks of AI-powered ransomware.

It is vital to implement a layered security approach. This includes using firewalls, keeping anti-virus and anti-malware software up to date, and deploying intrusion detection and prevention systems. Regularly patching and updating software is essential to close vulnerabilities that ransomware can exploit. In addition, endpoint detection and response (EDR) tools provide continuous monitoring and endpoint analysis to help quickly identify and mitigate threats. Employee training and awareness are equally fundamental. Since phishing is a common vector for ransomware attacks, training employees to recognize and report suspicious messages reduces risk. Simulated phishing exercises can help reinforce these habits and keep staff vigilant. Furthermore, a strong incident response plan is critical. This plan should include regular backups of essential data, with backups stored offline or in a separate storage environment. In the event of an attack, the organization can then restore its systems without giving in to ransom demands.

Conclusion

AI-powered ransomware campaigns mark a new era in cybersecurity, raising the stakes for individuals and organizations. As cybercriminals leverage AI to increase the sophistication and scale of their attacks, conventional security capabilities are becoming increasingly insufficient. Effective protection requires a comprehensive approach that integrates AI technology into the security stack, uses multiple layers of defense, maintains thorough employee training, and implements preventive measures. Organizations also need robust incident response plans and regularly maintained backups of essential data to limit the impact of attacks on operations. Staying ahead of the ever-changing threat landscape requires innovation and constant vigilance.
By proactively adapting to these advanced threats, individuals and organizations can better protect their digital assets and guard against the growing threat of AI-powered ransomware.
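The defensive use of AI described above usually starts with anomaly detection over security telemetry. The sketch below is a minimal, illustrative example rather than a production detector: it uses scikit-learn's IsolationForest, and the feature names (failed_logins, bytes_out, files_modified), values, and threshold are assumptions chosen purely for illustration.

Python
# Minimal sketch of AI-assisted anomaly detection over log-derived features.
# The features (failed_logins, bytes_out, files_modified) are hypothetical.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Simulated baseline activity: a few failed logins, modest egress, few file changes
normal_activity = rng.normal(loc=[2, 5_000, 20], scale=[1, 1_000, 5], size=(500, 3))

# Two suspicious events: many failed logins, heavy egress, mass file modification
suspicious_events = np.array([[40, 90_000, 900], [25, 60_000, 700]])

detector = IsolationForest(contamination=0.01, random_state=0)
detector.fit(normal_activity)

# predict() returns -1 for anomalies and 1 for inliers
print(detector.predict(suspicious_events))    # likely [-1 -1]
print(detector.predict(normal_activity[:5]))  # mostly 1s

A real deployment would derive features from its own logs and retrain the detector regularly, but even this toy setup shows how a model trained only on normal behavior can surface the unusual access patterns discussed above.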
Hey, DZone Community! We have an exciting year ahead of research for our beloved Trend Reports. And once again, we are asking for your insights and expertise (anonymously if you choose) — readers just like you drive the content we cover in our Trend Reports. Check out the details for our research survey below. Comic by Daniel Stori

Generative AI Research

Generative AI is revolutionizing industries, and software development is no exception. At DZone, we're diving deep into how GenAI models, algorithms, and implementation strategies are reshaping the way we write code and build software. Take our short research survey (~10 minutes) to contribute to our latest findings. We're exploring key topics, including:

Embracing generative AI (or not)
Multimodal AI
The influence of LLMs
Intelligent search
Emerging tech

And don't forget to enter the raffle for a chance to win an e-gift card of your choice! Join the GenAI Research

Over the coming month, we will compile and analyze data from hundreds of respondents; results and observations will be featured in the "Key Research Findings" of our Trend Reports. Your responses help inform the narrative of our Trend Reports, so we truly cannot do this without you. Stay tuned for each report's launch and see how your insights align with the larger DZone Community. We thank you in advance for your help!

—The DZone Content and Community team
DZone events bring together industry leaders, innovators, and peers to explore the latest trends, share insights, and tackle industry challenges. From Virtual Roundtables to Fireside Chats, our events cover a wide range of topics, each tailored to provide you, our DZone audience, with practical knowledge, meaningful discussions, and support for your professional growth.

DZone Events Happening Soon

Below, you'll find upcoming events that you won't want to miss.

What to Consider When Building an IDP
Date: March 4, 2025
Time: 1:00 PM ET
Register for Free!
Is your development team bogged down by manual tasks and "TicketOps"? Internal Developer Portals (IDPs) streamline onboarding, automate workflows, and enhance productivity — but should you build or buy? Join Harness and DZone for a webinar to explore key IDP capabilities, compare Backstage vs. managed solutions, and learn how to drive adoption while balancing cost and flexibility.

DevOps for Oracle Applications with FlexDeploy: Automation and Compliance Made Easy
Date: March 11, 2025
Time: 1:00 PM ET
Register for Free!
Join Flexagon and DZone as Flexagon's CEO unveils how FlexDeploy is helping organizations future-proof their DevOps strategy for Oracle Applications and Infrastructure. Explore innovations for automation through compliance, along with real-world success stories from companies who have adopted FlexDeploy.

Make AI Your App Development Advantage: Learn Why and How
Date: March 12, 2025
Time: 10:00 AM ET
Register for Free!
The future of app development is here, and AI is leading the charge. Join OutSystems and DZone, on March 12th at 10am ET, for an exclusive webinar with Luis Blando, CPTO of OutSystems, and John Rymer, industry analyst at Analysis.Tech, as they discuss how AI and low-code are revolutionizing development. You will also hear from David Gilkey, Leader of Solution Architecture, Americas East at OutSystems, and Roy van de Kerkhof, Director at NovioQ. This session will give you the tools and knowledge you need to accelerate your development and stay ahead of the curve in the ever-evolving tech landscape.

Developer Experience: The Coalescence of Developer Productivity, Process Satisfaction, and Platform Engineering
Date: March 12, 2025
Time: 1:00 PM ET
Register for Free!
Explore the future of developer experience at DZone's Virtual Roundtable, where a panel will dive into key insights from the 2025 Developer Experience Trend Report. Discover how AI, automation, and developer-centric strategies are shaping workflows, productivity, and satisfaction. Don't miss this opportunity to connect with industry experts and peers shaping the next chapter of software development.

Unpacking the 2025 Developer Experience Trends Report: Insights, Gaps, and Putting it into Action
Date: March 19, 2025
Time: 1:00 PM ET
Register for Free!
We've just seen the 2025 Developer Experience Trends Report from DZone, and while it shines a light on important themes like platform engineering, developer advocacy, and productivity metrics, there are some key gaps that deserve attention. Join Cortex Co-founders Anish Dhar and Ganesh Datta for a special webinar, hosted in partnership with DZone, where they'll dive into what the report gets right — and challenge the assumptions shaping the DevEx conversation. Their take? Developer experience is grounded in clear ownership. Without ownership clarity, teams face accountability challenges, cognitive overload, and inconsistent standards, ultimately hampering productivity.
Don't miss this deep dive into the trends shaping your team's future.

Accelerating Software Delivery: Unifying Application and Database Changes in Modern CI/CD
Date: March 25, 2025
Time: 1:00 PM ET
Register for Free!
Want to speed up your software delivery? It's time to unify your application and database changes. Join us for Accelerating Software Delivery: Unifying Application and Database Changes in Modern CI/CD, where we'll teach you how to seamlessly integrate database updates into your CI/CD pipeline.

Petabyte Scale, Gigabyte Costs: Mezmo's ElasticSearch to Quickwit Evolution
Date: March 27, 2025
Time: 1:00 PM ET
Register for Free!
For Mezmo, scaling their infrastructure meant facing significant challenges with ElasticSearch. That's when they made the decision to transition to Quickwit, an open-source, cloud-native search engine designed to handle large-scale data efficiently. This is a must-attend session for anyone looking for insights on improving search platform scalability and managing data growth.

What's Next?

DZone has more in store! Stay tuned for announcements about upcoming Webinars, Virtual Roundtables, Fireside Chats, and other developer-focused events. Whether you're looking to sharpen your skills, explore new tools, or connect with industry leaders, there's always something exciting on the horizon. Don't miss out — save this article and check back often for updates!
Amazon Aurora PostgreSQL-compatible edition major version 12.x and Amazon RDS for PostgreSQL 12 reach the end of standard support on February 28, 2025. Higher database versions introduce new features, enhancing operational efficiency and cost-effectiveness. Identifying qualified databases and upgrading them promptly is crucial. As the end of standard support is approaching, it's crucial for database administrators and developers to understand the implications and plan for the future. This article discusses PostgreSQL 12's end-of-standard support for Aurora and RDS PostgreSQL 12, Amazon RDS extended support, upgrade options, and the benefits of moving to PostgreSQL 16.

Understanding PostgreSQL End-of-Life

The PostgreSQL Global Development Group typically releases a new major version annually, introducing new features and improvements. Concurrently, older versions reach their end of life (EOL) after about five years of support. PostgreSQL 12, released in October 2019, is scheduled to reach its end of life in November 2024. When a PostgreSQL version reaches EOL, it no longer receives updates, bug fixes, or security patches from the community. This lack of support can leave databases vulnerable to security risks and performance issues, making it essential for users to upgrade to newer, supported versions.

Aurora and RDS PostgreSQL End-of-Standard Support

For users of Amazon Aurora and Amazon RDS PostgreSQL, the end of standard support for version 12 is set for February 28, 2025. After this date, Aurora PostgreSQL will continue to release patches for extended supported versions, while RDS will provide new minor versions with bug fixes and CVE patches for extended supported versions. Key points to remember:

After February 28, 2025, you can still create Aurora PostgreSQL LTS 12.9 and 12.20, and RDS PostgreSQL 12.22 databases, which will be automatically enrolled in Extended Support.
When restoring PostgreSQL 12 database snapshots, you'll have the option to disable RDS extended support.
If RDS extended support is disabled, the database will be forced to upgrade. If the upgrade fails, the database will remain in extended support and continue to incur charges.

Amazon RDS Extended Support

Amazon RDS Extended Support is a valuable service for users who need additional time to plan and execute their database upgrades. This service provides continued security and critical bug fix patches for a limited period beyond the end of the standard support date. Benefits of RDS Extended Support:

Continued security patches and critical bug fixes
Additional time to plan and execute upgrades
Flexibility in managing database versions

It's important to note that while Extended Support provides a safety net, it should not be considered a long-term solution. Users should actively plan for upgrading to a newer, fully supported version of PostgreSQL.

Upgrade Options for Aurora or RDS PostgreSQL

When it comes to upgrading Aurora or RDS PostgreSQL databases, users have three main options:

1. In-Place Upgrade
Managed by RDS
Can be performed using Amazon RDS console or AWS CLI
Requires downtime

2. Amazon RDS Blue or Green Deployments
Can be performed using Amazon RDS console
Minimizes risks and downtime
Restricts some operations until the upgrade is completed
3. Out-of-Place Upgrade
Reduces downtime by upgrading a copy of the database
Involves continuous replication with the new version until the final cutover
Requires more manual steps and planning

The below image shows the pros and cons of the above upgrade options. Each upgrade option has its advantages and considerations. The choice depends on factors such as downtime tolerance, database size, and operational requirements.

Benefits of Upgrading to PostgreSQL 16

PostgreSQL 16, released in September 2023, brings significant improvements and new features that make it an attractive upgrade target. Here are some key benefits:

1. Performance Improvements
Enhanced query planner with parallelization of FULL and RIGHT joins
Optimized execution of window functions
Improved bulk loading performance for COPY commands
Vacuum strategy enhancements reducing the need for full-table freezes
CPU acceleration through SIMD technology for x86 and ARM architectures

2. Logical Replication Enhancements
Bidirectional replication
Replication from standby (RDS PostgreSQL only)
Performance improvements for logical replication

3. Monitoring and Diagnostics
Introduction of the pg_stat_io view for detailed I/O metrics
New columns in the pg_stat_all_tables and pg_stat_all_indexes views
Improved auto_explain readability and query tracking accuracy

4. Security Enhancements
Security-focused client connection parameters
Support for Kerberos credential delegation
Introduction of the require_auth connection parameter for specifying acceptable authentication methods

It's important to note that as managed database services, some features may not apply to Aurora or RDS PostgreSQL databases.

Planning Your Upgrade

As the end of life for PostgreSQL 12 approaches, it's crucial to start planning your upgrade strategy. Here are some steps to consider:

Assess your current PostgreSQL 12 deployment and identify any dependencies or custom configurations.
Review the features and improvements in PostgreSQL 16 to understand the potential benefits for your specific use case.
Choose the most appropriate upgrade method based on your downtime tolerance and operational requirements.
Create a test environment to validate the upgrade process and identify any potential issues.
Develop a rollback plan in case of unexpected problems during the upgrade.
Schedule the upgrade during a low-traffic period to minimize disruption.
After the upgrade, monitor the database closely for any performance changes or issues.

The below image depicts a typical flowchart for a PostgreSQL database upgrade.

Conclusion

The end of life for PostgreSQL 12 presents both challenges and opportunities for database administrators and developers. While it necessitates careful planning and execution of upgrades, it also provides a chance to leverage the latest features and improvements in PostgreSQL 16. By understanding the implications of end of life, exploring upgrade options, and preparing thoroughly, organizations can ensure a smooth transition to a newer, more robust version of PostgreSQL. Remember, staying current with supported database versions is not just about accessing new features — it's a critical aspect of maintaining the security, performance, and reliability of your database infrastructure. Start planning your PostgreSQL upgrade today to ensure your databases remain supported and optimized for the future.
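To make the in-place option concrete, here is a minimal sketch using boto3. The instance identifier ("my-postgres12-instance") and target version ("16.4") are placeholders, not values from the article; always validate the upgrade path and take a snapshot before running a major version upgrade.

Python
# Minimal sketch: list valid upgrade targets and trigger an in-place major
# version upgrade for an RDS PostgreSQL instance. Identifiers are placeholders.
import boto3

rds = boto3.client("rds")

# List the major version upgrade targets registered for PostgreSQL 12.22
response = rds.describe_db_engine_versions(Engine="postgres", EngineVersion="12.22")
for version in response["DBEngineVersions"]:
    for target in version.get("ValidUpgradeTarget", []):
        if target.get("IsMajorVersionUpgrade"):
            print("Valid major upgrade target:", target["EngineVersion"])

# Trigger the in-place upgrade (this path incurs downtime, as noted above)
rds.modify_db_instance(
    DBInstanceIdentifier="my-postgres12-instance",
    EngineVersion="16.4",
    AllowMajorVersionUpgrade=True,
    ApplyImmediately=True,
)

The same check-then-modify pattern applies to Aurora clusters via the corresponding cluster-level calls; the Blue/Green and out-of-place paths involve additional resources and are better driven from the console or infrastructure-as-code tooling.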
Artificial intelligence is a transformative force across industries, from healthcare and finance to virtually every other sector. AI systems perform best when trained on data that has been properly prepared, and AI success depends on that quality: inaccurate, incomplete, duplicated, or conflicting records lead to diminished performance, higher operational costs, biased decisions, and flawed insights. AI developers often underestimate the true cost of dirty data, even though it directly affects business performance, user trust, and project success.

The Financial Burden of Poor Data Quality

The most direct cost of dirty data in AI development is financial. Organizations that depend on AI systems for automated decision-making must budget significant resources for cleaning, preparing, and validating their datasets. Studies show that poor data quality causes millions of dollars in losses every year through inefficiency, prediction errors, and wasted resources. Faulty training data can lead businesses to waste resources, target the wrong customers, or, in healthcare, misdiagnose patients. Cleaning and correcting bad data also creates extra work for engineering and data science teams. Data professionals spend large portions of their working hours on data cleaning, diverting attention from model optimization and innovation. Dealing with impaired data slows AI development timelines and raises operational costs, making projects less profitable and delaying the release of AI-driven products.

Bias and Ethical Risks

Dirty data also leads AI models to develop and reinforce biases, producing unethical and unfair results. An AI model is only as good as its training data: biases in the input produce biased outputs. Facial recognition, hiring algorithms, and lending decisions have all shown prejudice against specific population groups when built on skewed data. Deploying biased AI can seriously damage an organization's reputation, trigger legal and compliance problems, anger customers, and draw regulatory scrutiny. Correcting bias after deployment is far more difficult and expensive than maintaining data quality during development. Companies must start with clean, diverse, and representative datasets to minimize ethical risks and improve AI fairness and reliability.

Decreased Model Performance and Accuracy

High-quality data is the foundation of accurate predictions; corrupt data produces inaccurate forecasts. Dirty data introduces inconsistencies that make it harder for machine learning algorithms to discover meaningful patterns.
A predictive maintenance system in manufacturing, for example, will deliver poor results if it is trained on corrupted sensor readings: it will miss impending equipment failures, leading to unexpected breakdowns and costly operational stoppages. AI-powered customer support chatbots that learn from imprecise data deliver unreliable answers, eroding customer trust in the brand. The performance issues caused by dirty data force companies to constantly retrain and manually adjust their AI systems, adding costs that diminish overall operational effectiveness. Addressing data quality at the start of development produces more durable and dependable models.

Compliance and Regulatory Challenges

Dirty data also creates substantial challenges in complying with privacy regulations such as GDPR and CCPA. Storing inaccurate or duplicated personal data can violate data protection laws, exposing organizations to legal consequences and significant financial penalties. Companies that handle sensitive financial and health-related information are explicitly required by regulation to keep that data accurate. Regulators and stakeholders also increasingly demand explainable AI and transparent decision-making. Flawed data sources combined with untraceable AI decisions undermine the trust of users and regulators because organizations cannot justify their AI-based decisions. Organizations that establish robust data governance and validation practices are better positioned to achieve compliance and improve the transparency and accountability of their AI systems.

The Role of Data Governance in Mitigating Dirty Data

Effective data governance takes proactive measures to reduce the impact of dirty data during AI development. Organizations need comprehensive data management systems that combine data quality assessment, data cleansing, and ongoing monitoring. Standardized data entry practices combined with automated data cleaning reduce errors before they can damage AI models. Organizations also need to build a culture of data accountability: employees should be trained in correct data handling procedures, and data engineers and scientists should work alongside business stakeholders to improve data quality. Strong data governance reduces AI errors and operational risk while delivering the maximum possible benefit from AI innovation.

The Path Forward: Addressing Dirty Data Challenges

AI implementations require clean data: imprecise data carries heavy financial consequences, undermines ethical principles, degrades model performance, and jeopardizes regulatory compliance. Organizations need strong data management practices, data cleaning tools, and governance rules to reduce the risks that stem from poor data quality.
Addressing dirty data at the beginning of the AI pipeline enables businesses to improve AI reliability, build user trust, and extract maximum value from their AI-powered projects.
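As a small illustration of the automated cleaning and validation steps described above, the following sketch uses pandas. The column names and validation rules are hypothetical; a real pipeline would encode its own schema and business rules.

Python
# Minimal sketch of automated deduplication and validation with pandas.
# The columns (customer_id, email, age) and rules are hypothetical examples.
import pandas as pd

records = pd.DataFrame({
    "customer_id": [101, 101, 102, 103, None],
    "email": ["a@x.com", "a@x.com", "b@x.com", "not-an-email", "c@x.com"],
    "age": [34, 34, -5, 29, 41],
})

# Drop exact duplicates and rows missing the primary key
clean = records.drop_duplicates().dropna(subset=["customer_id"])

# Simple validation rules: plausible age range and a minimal email format check
clean = clean[clean["age"].between(0, 120)]
clean = clean[clean["email"].str.contains(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", regex=True)]

print(f"{len(records) - len(clean)} of {len(records)} rows rejected")
print(clean)

Rules like these are cheap to run on every ingest and catch exactly the duplicated, missing, and implausible records that otherwise end up in training data.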
Identity and access management (IAM), service IDs, and service credentials are crucial components in securing and managing access to object storage services across various cloud platforms. These elements work together to provide a robust framework for controlling who can access data stored in the cloud and what actions they can perform. In the previous article, you were introduced to the top tools for object storage and data management. In this article, you will learn how to restrict access (read-only) to the object storage bucket through custom roles, access groups, and service credentials.

Identity and Access Management (IAM)

IAM is a comprehensive system for managing access to cloud resources, including object storage. It allows organizations to control who has access to what resources and what actions they can perform on those resources. IAM is built on several key concepts:

Users – Individual accounts that represent people or applications accessing cloud resources
Roles – Collections of permissions that define what actions can be performed on resources
Policies – Rules that associate users or groups with roles, determining their access levels
Resources – The cloud assets being protected, such as buckets and objects in object storage

In the context of object storage, IAM enables administrators to:

Grant read, write, or delete permissions on specific buckets or objects
Implement the principle of least privilege by assigning only necessary permissions
Manage access across multiple teams or projects within an organization
Enforce compliance requirements by controlling data access

IAM systems typically offer predefined roles that cover common use cases, such as "Object Creator" or "Object Viewer." These roles can be assigned to users or groups to quickly set up appropriate access levels. For more granular control, custom roles can often be created to meet specific organizational needs.

Service IDs

Service IDs are a concept closely related to IAM, particularly useful in the context of object storage. A service ID is a unique identifier that represents an application or service, rather than an individual user. Service IDs are essential for scenarios where:

Applications need programmatic access to object storage
Automated processes require authentication to perform tasks
Long-lived access is needed without tying it to a specific user account

Service IDs can be assigned IAM roles and policies, just like user accounts. This allows for precise control over what actions an application or service can perform on object storage resources. For example, a backup application might be given a service ID with permissions to read and write to specific buckets, but not to delete objects or modify bucket settings. Key advantages of using service IDs include:

Separation of concerns. Access for applications is managed separately from user access.
Auditability. Actions performed using service IDs can be tracked and audited.
Lifecycle management. Service IDs can be created, modified, or revoked without affecting user accounts.
Scalability. Multiple applications can use the same service ID if they require identical permissions.

Service Credentials

Service credentials are the authentication details associated with service IDs. They provide the necessary information for applications to securely connect to and interact with object storage services.
Service credentials typically include:

API keys – Unique identifiers used for authentication
Secret keys – Confidential strings used in combination with API keys to verify identity
Endpoints – URLs specifying where to send API requests
Resource identifiers – Unique strings identifying specific object storage instances or resources

Service credentials are crucial for enabling secure programmatic access to object storage. They allow applications to authenticate and perform actions without requiring interactive login processes. When working with service credentials, it's important to:

Securely store and manage these credentials to prevent unauthorized access
Implement credential rotation practices to periodically update access keys
Use environment variables or secure secret management systems to inject credentials into applications

HMAC Credentials

Hash-based message authentication code (HMAC) credentials are a specific type of authentication mechanism used in object storage systems. HMAC credentials consist of an access key ID and a secret access key. They play a crucial role in securing object storage by providing a way to sign API requests and ensuring the integrity and authenticity of the requests. HMAC credentials are particularly important for maintaining compatibility with S3-style APIs and tools across different cloud providers.

Use Case: Read-Only Access to the Object Storage Bucket

Let's bring everything together. One use case that combines IAM (custom roles), service credentials, and a service ID is providing read-only access to a specific object storage bucket. This restricted access applies both to the UI and to tools like Cyberduck or S3 Browser. Relationship diagram

Create a Custom Role

Follow the instructions here to create a custom role with the below actions:

cloud-object-storage.account.get_account_buckets – Read access to the buckets in a cloud object storage service instance
resource-controller.credential.retrieve_all – Read access to the service credentials
cloud-object-storage.object.get_tagging – Read access to the object storage tags

Create an Access Group

Follow the instructions to create an access group with policies restricting access to the specific object storage service and bucket-level access to the individual users with the custom role that you created. Remember to restrict the resource to a bucket and service access to Content Reader and Object Reader for read-only access.

Create a Service Credential With a Service ID and HMAC Enabled

1. Follow the instructions to create a service credential with a service ID. Specify None to assign no role to the new credential, as we will manage access by associating a service ID with the service credential. Create object storage service credentials with HMAC enabled. HMAC credentials include the access key and secret key required to connect from external tools like MinIO, Cyberduck, and S3 Browser. Check my article on how to use these tools with HMAC credentials.

2. Assign the access group that is created above to the created service ID. As mentioned in the article here, using the HMAC credentials will now restrict access to read-only on the specific object storage bucket.

Cloud Providers and Object Storage

Several major cloud providers offer object storage services, each with its own implementation of IAM, service IDs, and credentials:

Amazon Web Services (AWS) S3
Uses AWS identity and access management (IAM)
Offers IAM users and roles
Provides access keys and secret access keys for authentication

Google Cloud Storage
Integrates with Google Cloud IAM
Supports service accounts for application-level access
Offers HMAC keys for interoperability

Microsoft Azure Blob Storage
Uses Azure Active Directory (Azure AD) for identity management
Provides managed identities for Azure resources
Offers shared access signatures (SAS) for granular, time-limited access

IBM Cloud Object Storage
Implements IBM Cloud IAM
Supports service IDs for application authentication
Provides HMAC credentials for S3 API compatibility

Oracle Cloud Infrastructure (OCI) Object Storage
Uses OCI IAM for access management
Offers instance principals for compute instances to access object storage
Supports customer-managed keys for enhanced security

These providers generally follow similar principles in their IAM and credential management systems, but with variations in terminology and specific features. For example, what IBM calls a "Service ID" might be referred to as a "Service Account" in Google Cloud or a "Service Principal" in Azure.

Best Practices for IAM and Credentials in Object Storage

To effectively manage access and security in object storage using IAM, service IDs, and service credentials, consider the following best practices:

Implement the principle of least privilege. Grant only the minimum necessary permissions to users and service IDs.
Use service IDs for application access. Avoid using personal user credentials for application authentication.
Regularly audit and review access. Periodically check and update IAM policies and service ID permissions.
Implement credential rotation. Establish processes for regularly updating service credentials and API keys.
Secure credential storage. Use encrypted storage and secure secret management systems for storing service credentials.
Enable multi-factor authentication. For user accounts with high-level permissions, enable MFA for added security.
Use IAM roles and groups. Leverage roles and groups to simplify permission management for multiple users or applications.
Monitor and log access. Implement comprehensive logging and monitoring of object storage access and operations.
Educate team members. Ensure that all team members understand IAM concepts and follow security best practices.
Align with compliance requirements. Ensure that IAM policies and practices meet relevant industry standards and regulations.

Conclusion

In conclusion, IAM, service IDs, and service credentials form a critical triad in securing and managing access to object storage across various cloud platforms. Understanding these concepts and implementing them effectively would enable organizations to ensure that their data remains secure, accessible to the right entities, and compliant with relevant regulations. As cloud technologies evolve, staying informed about the latest developments in identity and access management will be crucial for maintaining robust security in object storage implementations.
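As a practical companion to the HMAC credentials and read-only use case above, here is a minimal Python sketch of connecting to an S3-compatible object storage endpoint with an access key and secret key. The endpoint URL, bucket name, and key values are placeholders; in practice they come from the service credential created for the read-only service ID.

Python
# Minimal sketch of using HMAC credentials against an S3-compatible endpoint.
# Endpoint, bucket, and key values are placeholders, not real credentials.
import boto3
from botocore.exceptions import ClientError

cos = boto3.client(
    "s3",
    endpoint_url="https://s3.example-objectstorage.cloud",  # placeholder endpoint
    aws_access_key_id="HMAC_ACCESS_KEY_ID",                 # from the service credential
    aws_secret_access_key="HMAC_SECRET_ACCESS_KEY",         # from the service credential
)

# With the read-only custom role and access group above, listing objects succeeds
for obj in cos.list_objects_v2(Bucket="reports-bucket").get("Contents", []):
    print(obj["Key"], obj["Size"])

# ...while a write attempt should be rejected by the read-only policy
try:
    cos.put_object(Bucket="reports-bucket", Key="test.txt", Body=b"should fail")
except ClientError as err:
    print("Write blocked as expected:", err.response["Error"]["Code"])

Tools like Cyberduck, S3 Browser, and MinIO clients take the same access key, secret key, and endpoint, so the permissions behave identically whether access is programmatic or through a GUI.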
Scanning file uploads for viruses, malware, and other threats is standard practice in any application that processes files from an external source. No matter which antimalware we use, the goal is always the same: to prevent malicious executables from reaching a downstream user (directly, via database storage, etc.) or automated workflow that might inadvertently execute the malicious content. In this article, we’ll discuss the value of quarantining malicious files after they’re flagged by an antimalware solution instead of outright deleting them. We’ll highlight several APIs Java developers can leverage to quarantine malicious content seamlessly in their application workflow. Deleting vs. Quarantining Malicious Files While there’s zero debate around whether external files should be scanned for malicious content, there’s a bit more room for debate around how malicious files should be handled once antimalware policies flag them. The simplest (and overall safest) option is to programmatically delete malicious files as soon as they’re flagged. The logic for deleting a threat is straightforward: it completely removes the possibility that downstream users or processes might unwittingly execute the malicious content. If our antimalware false positive rate is extremely low — which it ideally should be — we don’t need to spend too much time debating whether the file in question was misdiagnosed. We can shoot first and ask questions later. When we elect to programmatically quarantine malicious files, we take on risk in an already-risky situation — but that risk can yield significant rewards. If we can safely contain a malicious file within an isolated directory (e.g., a secure zip archive), we can preserve the opportunity to analyze the threat and gain valuable insights from it. This is a bit like sealing a potentially venomous snake in a glass container; with a closer look, we can find out if the snake is truly dangerous, misidentified, or an entirely unique specimen that demands further study to adequately understand. In quarantining a malicious file, we might be preserving the latest update of some well-known and oft-employed black market malware library, or in cases involving heuristic malware detection policies, we might be capturing an as-of-yet-unseen malware iteration. Giving threat researchers the opportunity to analyze malicious files in a sandbox can, for example, tell us how iterations of a known malware library have evolved, and in the event of a false-positive threat diagnosis, it can tell us that our antimalware solution may need an urgent update. Further, quarantining gives us the opportunity to collect useful data about the attack vectors (in this case, insecure file upload) threat actors are presently exploiting to harm our system. Using ZIP Archives as Isolated Directories for Quarantine The simplest and most effective way to quarantine a malicious file is to lock it within a compressed ZIP archive. ZIP archives are well-positioned as lightweight, secure, and easily transferrable isolated directories. After compressing a malicious file in a ZIP archive, we can encrypt the archive to restrict access and prevent accidental execution, and we can apply password-protection policies to ensure only folks with specific privileges can decrypt and “unzip” the archive. Open-Source APIs for Handling ZIP Compression, Encryption, and Password-Protection in Java In Java, we have several open-source tools at our disposal for archiving a file securely in any capacity. 
We could, for example, use the Apache Commons Compress library to create the initial zip archive that we compress the malicious file in (this library adds some notable features to the standard java.util.zip package), and we could subsequently use a robust cryptography API like Tink (by Google) to securely encrypt the archive. After that, we could leverage another popular library like Zip4j to password protect the archive (it's worth noting we could handle all three steps via Zip4j if we preferred; this library features the ability to create archives, encrypt them with AES or other zip standard encryption methods, and create password protection policies).

Creating a ZIP Quarantine File With a Web API

If open-source technologies won't fit into the scope of our project, another option is to use a single, fully realized zip quarantine API in our Java workflow. This can help simplify the end-to-end quarantining process and mitigate some of the risks involved in handling malicious files by abstracting the entire process to an external server. Below, we'll walk through how to implement one such solution into our Java project. This solution is free to use with a free API key, and it offers a simple set of parameters for creating a password, compressing a malicious file, and encrypting the archive.

We can install the Java SDK with Maven by first adding a reference to the repository in pom.xml:

XML
<repositories>
    <repository>
        <id>jitpack.io</id>
        <url>https://jitpack.io</url>
    </repository>
</repositories>

And after that, we can add a reference to the dependency in pom.xml:

XML
<dependencies>
    <dependency>
        <groupId>com.github.Cloudmersive</groupId>
        <artifactId>Cloudmersive.APIClient.Java</artifactId>
        <version>v4.25</version>
    </dependency>
</dependencies>

For a Gradle project, we could instead place the below snippet in our root build.gradle:

Groovy
allprojects {
    repositories {
        ...
        maven { url 'https://jitpack.io' }
    }
}

And we could then add the following dependency in our build.gradle:

Groovy
dependencies {
    implementation 'com.github.Cloudmersive:Cloudmersive.APIClient.Java:v4.25'
}

With installation out of the way, we can copy the import classes at the top of our file:

Java
// Import classes:
//import com.cloudmersive.client.invoker.ApiClient;
//import com.cloudmersive.client.invoker.ApiException;
//import com.cloudmersive.client.invoker.Configuration;
//import com.cloudmersive.client.invoker.auth.*;
//import com.cloudmersive.client.ZipArchiveApi;

Now, we can configure our API key to authorize the zip quarantine request:

Java
ApiClient defaultClient = Configuration.getDefaultApiClient();

// Configure API key authorization: Apikey
ApiKeyAuth Apikey = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
Apikey.setApiKey("YOUR API KEY");
// Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
//Apikey.setApiKeyPrefix("Token");

Finally, we can create an instance of the ZipArchiveApi and configure our password, file input, and encryption parameters. We can customize our encryption algorithm by selecting from one of three options: AES-256, AES-128, and PK-Zip (AES-256 is the default value if we leave this parameter empty; PK-Zip is technically a valid option but not recommended). We can then call the API and handle errors via the try/catch block.
Java
ZipArchiveApi apiInstance = new ZipArchiveApi();
String password = "password_example"; // String | Password to place on the Zip file; the longer the password, the more secure
File inputFile1 = new File("/path/to/inputfile"); // File | First input file to perform the operation on.
String encryptionAlgorithm = "encryptionAlgorithm_example"; // String | Encryption algorithm to use; possible values are AES-256 (recommended), AES-128, and PK-Zip (not recommended; legacy, weak encryption algorithm). Default is AES-256.
try {
    Object result = apiInstance.zipArchiveZipCreateQuarantine(password, inputFile1, encryptionAlgorithm);
    System.out.println(result);
} catch (ApiException e) {
    System.err.println("Exception when calling ZipArchiveApi#zipArchiveZipCreateQuarantine");
    e.printStackTrace();
}

After the API returns our quarantined file, we can upload the archive to a cloud-based quarantine repository, transfer it to a virtual machine, or take any number of different actions.

Conclusion

In this article, we discussed the benefits of quarantining malicious files after our antimalware software flags them. We then highlighted several open-source Java libraries that can be collectively used to quarantine malicious files in an encrypted, password-protected zip archive. Finally, we highlighted one fully realized (not open source) web API solution for handling each stage of that process with minimal code.
When I began my journey into the field of AI and large language models (LLMs), my initial aim was to experiment with various models and learn about their effectiveness. Like most developers, I also began using cloud-hosted services, enticed by the ease of quick setup and availability of ready-to-use LLMs at my fingertips. But pretty quickly, I ran into a snag: cost. It is convenient to use LLMs in the cloud, but the pay-per-token model can suddenly get really expensive, especially when working with lots of text or asking many questions. It made me realize I needed a better way to learn and experiment with AI without blowing my budget. This is where Ollama came in, and it offered a rather interesting solution. By using Ollama, you can:

Load and experiment with multiple LLMs locally
Avoid API rate limits and usage restrictions
Customize and fine-tune LLMs

In this article, we will explore how to build a simple document summarization tool using Ollama, Streamlit, and LangChain. Ollama allows us to run LLMs locally, Streamlit provides a web interface so that users may interact with those models smoothly, and LangChain offers pre-built chains for simplified development.

Environment Setup

Ensure Python 3.12 or higher is installed.
Download and install Ollama.
Fetch the llama3.2 model via ollama run llama3.2.
I prefer to use Conda for managing dependencies and creating isolated environments. Create a new Conda environment and then install the necessary packages mentioned below.

Shell
pip install streamlit langchain langchain-ollama langchain-community langchain-core pymupdf

Now, let's dive into building our document summarizer. We will start by creating a Streamlit app to handle uploading documents and displaying summaries in a user-friendly interface. Next, we will focus on pulling the text out of the uploaded documents (supports only PDF and text documents) and preparing everything for the summarization chain. Finally, we will bring Ollama to actually perform the summarization utilizing its local language model capabilities to generate concise and informative summaries. The code below contains the complete implementation, with detailed comments to guide you through each step.
Python
import os
import tempfile
import streamlit as stlit
from langchain_text_splitters import CharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain
from langchain_ollama import OllamaLLM
from langchain_community.document_loaders import PyMuPDFLoader
from langchain_core.documents import Document

# Create Streamlit app by page configuration, title and a file uploader
stlit.set_page_config(page_title="Local Document Summarizer", layout="wide")
stlit.title("Local Document Summarizer")

# File uploader that accepts pdf and txt files only
uploaded_file = stlit.file_uploader("Choose a PDF or Text file", type=["pdf", "txt"])

# Process the uploaded file and extracts text from it
def process_file(uploaded_file):
    if uploaded_file.name.endswith(".pdf"):
        with tempfile.NamedTemporaryFile(delete=False) as temp_file:
            temp_file.write(uploaded_file.getvalue())
        loader = PyMuPDFLoader(temp_file.name)
        docs = loader.load()
        extracted_text = " ".join([doc.page_content for doc in docs])
        os.unlink(temp_file.name)
    else:
        # Read the content directly for text files, no need for tempfile
        extracted_text = uploaded_file.getvalue().decode("utf-8")
    return extracted_text

# Process the extracted text and return summary
def summarize(text):
    # Split the text into chunks for processing and create Document object
    chunks = CharacterTextSplitter(chunk_size=500, chunk_overlap=100).split_text(text)
    docs = [Document(page_content=chunk) for chunk in chunks]
    # Initialize the LLM with llama3.2 model and load the summarization chain
    chain = load_summarize_chain(OllamaLLM(model="llama3.2"), chain_type="map_reduce")
    return chain.invoke(docs)

if uploaded_file:
    # Process and preview the uploaded file content
    extracted_text = process_file(uploaded_file)
    stlit.text_area("Document Preview", extracted_text[:1200], height=200)

    # Generate a summary of the extracted text
    if stlit.button("Generate Summary"):
        with stlit.spinner("Summarizing...may take a few seconds"):
            summary_text = summarize(extracted_text)
            stlit.text_area("Summary", summary_text['output_text'], height=400)

Running the App

Save the above code snippet into summarizer.py, then open your terminal, navigate to where you saved the file, and run:

Shell
streamlit run summarizer.py

That should start your Streamlit app and automatically open in your web browser, pointing to a local URL like http://localhost:8501.

Conclusion

You've just completed the document summarization tool by combining Streamlit's simplicity and Ollama's local model hosting capabilities. This example utilizes the llama3.2 model, but you can experiment with other models to determine what is best for your needs, and you can also consider adding support for additional document formats, error handling, and customized summarization parameters. Happy AI experimenting!
Large language models (LLMs) have impacted natural language processing (NLP) by introducing advanced applications such as text generation, summarization, and conversational AI. Models like ChatGPT use a specific neural architecture called a transformer to predict the next word in a sequence, learning from enormous text datasets through self-attention mechanisms. This guide breaks down the step-by-step process for training generative AI models, including pre-training, fine-tuning, alignment, and practical considerations.

Overview of the Training Pipeline

Figure 1: Overview of LLM Training Pipeline

The training pipeline for LLMs is a structured, multi-phase process designed to enhance linguistic understanding, task-specific capabilities, and alignment with human preferences.

Data collection and preprocessing. Vast text data from diverse sources is collected, cleaned, tokenized, and normalized to ensure quality. High-quality, domain-specific data improves factual accuracy and reduces hallucinations.
Pre-training. This is the foundational stage where the model learns general language patterns through self-supervised learning, a technique for the model to teach itself patterns in text without needing labeled examples. Take, for example, next token prediction. This phase relies on massive datasets and transformer architectures to build broad linguistic capabilities.
Instruction fine-tuning. The model is trained on smaller, high-quality input-output datasets to specialize in specific tasks or domains. This instruction tuning step ensures more accurate and contextually appropriate outputs.
Model alignment. Reinforcement learning from human feedback (RLHF) refines the model's behavior in two steps: reward model training, in which human evaluators rank outputs to train a reward model, and policy optimization, in which the LLM is iteratively optimized to align with human preferences, ethical considerations, and user expectations.
Evaluation and iterative fine-tuning. The model is tested on unseen data to evaluate metrics like accuracy and coherence. Further fine-tuning may follow to adjust hyperparameters or incorporate new data.
Downstream application adaptation. The trained LLM is adapted for real-world applications (e.g., chatbots, content generation) through additional fine-tuning or integration with task-specific frameworks.

This pipeline transforms LLMs from general-purpose models into specialized tools capable of addressing diverse tasks effectively.

1. Pre-Training

Pre-training is the foundational stage in the development of LLMs, where a model learns general language patterns and representations from vast amounts of text data. This phase teaches the model grammar rules, contextual word relationships, and basic logical patterns (e.g., cause-effect relationships in text), thus forming the basis for its ability to perform diverse downstream tasks.

How Pre-Training Works

Figure 2: High-level overview of the pre-training stage

Objective

The primary goal of pre-training is to enable the model to predict the next token in a sequence. This is achieved through causal language modeling (CLM), which is a way to teach the model to predict what comes next in a sentence. In this step, the model learns to generate coherent and contextually relevant text by looking only at the past tokens.

Datasets

Pre-training requires massive and diverse datasets sourced from books, articles, websites, and other publicly available content. Popular datasets include Common Crawl, Wikipedia, The Pile, and BookCorpus.
Pre-Training Process

The model learns to predict the next token in a sequence through causal language modeling.
Model predictions are compared to the actual next tokens using a cross-entropy loss function, which measures model performance during training.
Model parameters are continuously adjusted to minimize the prediction loss until the model reaches an acceptable accuracy level.
The pre-training phase requires significant computational resources, often utilizing thousands of GPU hours across distributed systems to process the massive datasets needed for effective training.
This is a self-supervised learning approach: the model learns patterns directly from raw text by predicting next tokens, which eliminates the need for costly human annotations. (A small code sketch of this loss computation appears at the end of this section.)

In the following example, we use a GPT-2 model, which was pre-trained on a very large corpus of English data in a self-supervised fashion, without any human labeling.

Python

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

input_text = "The capital of France is"

# Tokenize the input text
model_inputs = tokenizer([input_text], return_tensors="pt")

# Run inference on the pretrained model and decode the output
generated_ids = model.generate(**model_inputs, max_new_tokens=25, do_sample=True)
print(tokenizer.batch_decode(generated_ids)[0])

As expected, the model completes the sentence "The capital of France is" by iteratively predicting the next token, as learned during pre-training.

Plain Text

The capital of France is the city of Paris which is more prosperous than the other capitals in ...

However, when the prompt is phrased as a question, i.e., "What is the capital of France?", the model fails to produce the correct result because, at this stage of training, it cannot follow instructions yet.

Python

text2 = "What is the capital of France?"
model_inputs = tokenizer([text2], return_tensors="pt")
generated_ids = model.generate(**model_inputs, max_new_tokens=25, do_sample=True)
print(tokenizer.batch_decode(generated_ids)[0])

Output:

Plain Text

What is the capital of France? In our opinion we should be able to count the number of people in France today. The government has made this a big priority

Benefits of Pre-Training

Broad language understanding. By training on diverse data, pre-trained models develop a comprehensive grasp of language structures and patterns, enabling them to generalize across various tasks.
Efficiency. Pre-trained models can be fine-tuned for specific tasks with smaller labeled datasets, saving time and resources compared to training models from scratch for each task.
Performance. Models that undergo pre-training followed by fine-tuning consistently outperform those trained solely on task-specific data because they can leverage knowledge from large-scale datasets.
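Before moving on to fine-tuning, here is a minimal, hedged sketch of the next-token cross-entropy loss described in the pre-training process above, computed for a single toy sentence with the same GPT-2 model. The learning rate and the single optimization step are illustrative only; real pre-training repeats this over billions of tokens on distributed hardware.

Python

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# A single training sentence; the labels are simply the input ids, and the model
# internally shifts them so that each position predicts the next token.
batch = tokenizer(["The capital of France is Paris."], return_tensors="pt")
outputs = model(**batch, labels=batch["input_ids"])

print(f"Cross-entropy loss: {outputs.loss.item():.3f}")

# One optimization step on this toy batch (illustrative learning rate)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()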
2. Instruction Fine-Tuning

Instruction fine-tuning is a specialized training technique that transforms general-purpose LLMs into responsive, instruction-following systems. Here, the model is trained on specific tasks like answering questions or summarizing text. By training models on curated (instruction, output) pairs, this method aligns LLMs' text generation capabilities with human-defined tasks and conversational patterns. A training (instruction, output) sample looks like this:

Plain Text

Instruction: What is the capital of Germany?
Response: The capital of Germany is Berlin.

Figure 3: Instruction fine-tuning stage

In the following example, we load the Gemma 2 model from Google, which is instruction-tuned on a variety of text generation tasks, including question answering, summarization, and reasoning.

Python

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the Gemma 2 2b instruct model
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")

# Tokenize input
input_text = "What is the capital of France?"
input_ids = tokenizer(input_text, return_tensors="pt")

# Run model inference and decode the output
outputs = model.generate(**input_ids, max_new_tokens=25, do_sample=True)
print(tokenizer.decode(outputs[0]))

This fine-tuned model is able to follow the instruction:

Plain Text

What is the capital of France?
The capital of France is Paris.

How Instruction Fine-Tuning Works

Objective

Instruction fine-tuning bridges the critical gap between an LLM's fundamental next-word prediction capability and practical task execution by teaching models to understand and follow natural language instructions. This process transforms general-purpose LLMs into responsive, instruction-following systems that consistently follow user commands like "Summarize this article" or "Write a Python function for X."

Supervised Learning

Unlike pre-training, which uses self-supervised learning on unlabeled data, instruction fine-tuning employs supervised learning with labeled instruction-output pairs. The process involves:

Using explicit instruction-response pairs for training
Updating model weights to optimize for instruction following
Maintaining the model's base knowledge while adapting response patterns

Dataset

An instruction dataset consists of three key components:

Instruction – the natural language command or request
Input – optional context or examples
Output – the desired response demonstrating correct task execution

Plain Text

Instruction: Find the solution to the quadratic equation.
Context: 3x² + 11x - 4 = 0
Response: The solutions of the quadratic equation are x = -4 and x = 1/3.

These datasets can be created through manual curation by domain experts, synthetic generation using other LLMs, or conversion of existing labeled datasets into instruction format.

Fine-Tuning Techniques

Two primary approaches dominate instruction fine-tuning:

Full model fine-tuning updates all model parameters, offering better performance for specific tasks at the cost of higher computational requirements.
Lightweight adaptation methods (like LoRA) modify small parts of the model instead of retraining everything, significantly reducing memory requirements; a minimal LoRA sketch follows at the end of this section.

Benefits of Instruction Fine-Tuning

Enhanced task generalization. Models develop meta-learning capabilities, improving performance on novel tasks without specific training.
Reduced prompt engineering. Fine-tuned models require fewer examples in prompts, making deployment more efficient.
Controlled output. Enables precise customization of response formats and styles.
Better instruction following. Bridges the gap between model capabilities and user expectations.
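As mentioned under fine-tuning techniques, parameter-efficient methods such as LoRA train only small adapter matrices. Below is a minimal, hedged sketch using the Hugging Face PEFT library with GPT-2; the rank, scaling factor, and target module are illustrative defaults, and a real run would also need an instruction dataset and a training loop.

Python

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Base model to adapt (GPT-2 here purely for illustration)
base_model = AutoModelForCausalLM.from_pretrained("gpt2")

# LoRA configuration: low-rank adapters injected into the attention projection.
# "c_attn" is GPT-2's fused attention projection; other architectures use
# different module names such as "q_proj"/"v_proj".
lora_config = LoraConfig(
    r=8,                       # rank of the low-rank update matrices
    lora_alpha=16,             # scaling factor for the adapter output
    target_modules=["c_attn"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)

# Only the adapter parameters are trainable -- typically well under 1% of the model
model.print_trainable_parameters()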
3. Alignment Tuning

Alignment or preference tuning is a critical phase in training LLMs to ensure the model avoids harmful or biased responses. This step goes beyond improving performance on specific tasks: it focuses on making models safer, more helpful, and better aligned with users by incorporating human feedback or predefined guidelines.

Why Alignment Is Necessary

Pre-trained LLMs are trained on massive datasets from the internet, which may contain biases, harmful content, or conflicting information. Without alignment, these models might give answers that are offensive or misleading. Alignment tuning filters harmful outputs (e.g., biased or dangerous content) using human feedback to ensure responses comply with safety guidelines. The following example from OpenAI's GPT-4 System Card shows the safety challenges that arise from the non-aligned "GPT-4 (early)" model.

Figure 4: Safety risks in the pre-alignment version of the "GPT-4 (early)" model

The GPT-4 System Card highlights the importance of fine-tuning the model using RLHF to align its responses with human preferences for helpfulness and harmlessness. This mitigates unsafe behavior and prevents the model from generating harmful or biased content.

Key Methods for Alignment

The following diagram from the DPO paper illustrates the most commonly used methods:

Figure 5: (Left) RLHF workflow showing human feedback integration. (Right) DPO skips reward modeling to directly align responses

Reinforcement Learning from Human Feedback (RLHF)

RLHF is a machine learning technique designed to align LLMs with human values, preferences, and expectations. By incorporating human feedback into the training process, RLHF enhances the model's ability to produce outputs that are coherent, useful, ethical, and aligned with user intent. This method has been crucial for making generative models like ChatGPT and Google Gemini safer and more reliable. The RLHF process consists of three main steps:

Step | Description | Outcome
Human feedback | Annotators rank outputs for relevance/ethics | Preference dataset creation
Reward model | Trained to predict human preferences | Quality scoring system
Policy optimization | LLM fine-tuned via reinforcement learning (e.g., PPO) | Aligned response generation

Collecting human feedback. Human annotators evaluate model-generated outputs by ranking or scoring them based on criteria such as relevance, coherence, and accuracy. Pairwise comparisons are commonly used, where annotators select the better response between two options. This feedback forms a "preference dataset" that reflects human judgment.

Training a reward model. A reward model is trained on the preference dataset to predict how well a given response aligns with human preferences. It assigns a scalar reward score (say, 0 to 10) to outputs, which is then used to teach the LLM to prioritize high-scoring responses (a minimal sketch of the pairwise loss follows at the end of this subsection).

Fine-tuning with reinforcement learning. The LLM is fine-tuned using reinforcement learning algorithms like Proximal Policy Optimization (PPO), which teaches the model to improve gradually rather than making dramatic changes all at once. The reward model guides this process by providing feedback on generated outputs, enabling the LLM to optimize its policy for producing high-reward responses.
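To illustrate the reward-model step, here is a minimal, hedged sketch of the pairwise (Bradley-Terry style) loss commonly used to train reward models on preference data. The tiny scoring network and random features stand in for a full LLM-based reward model scoring (prompt, response) pairs.

Python

import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy reward model: in practice this is an LLM with a scalar "value" head;
# here, a small MLP over dummy feature vectors is used purely for illustration.
reward_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))

# Dummy features for a batch of preference pairs: each row pairs a
# human-preferred ("chosen") response with a "rejected" one.
chosen_features = torch.randn(8, 16)
rejected_features = torch.randn(8, 16)

chosen_rewards = reward_model(chosen_features)      # shape: (8, 1)
rejected_rewards = reward_model(rejected_features)  # shape: (8, 1)

# Pairwise preference loss: push the chosen reward above the rejected reward
loss = -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

loss.backward()  # gradients would then drive an optimizer step
print(f"Pairwise reward loss: {loss.item():.3f}")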
Direct Preference Optimization (DPO)

Direct Preference Optimization (DPO) is an emerging training method designed to align LLMs with human preferences. It serves as a simpler and more efficient alternative to RLHF, bypassing the need for complex reinforcement learning algorithms like Proximal Policy Optimization (PPO). Instead, DPO skips reward modeling and trains the LLM directly on human-ranked responses. The preference data generation process remains the same as in the RLHF method described above. The DPO process consists of:

Direct optimization. Unlike RLHF, which trains a reward model and uses reinforcement learning, DPO directly fine-tunes the LLM to produce outputs that maximize alignment with the ranked preferences. This is achieved by training the model to favor high-ranked responses and avoid low-ranked ones, as sketched in the code below.

Model training. The optimization process adjusts the model's parameters to prioritize generating responses that align with human preferences, without requiring iterative policy updates as in RLHF.
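In code, the core of DPO reduces to a single loss over log-probabilities from the policy being trained and a frozen reference model. The following is a minimal, hedged sketch of that loss on dummy values; the beta value and the per-response log-probabilities are illustrative, and libraries such as TRL provide full DPO trainers.

Python

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # DPO objective: prefer responses humans ranked higher, measured as the
    # policy's log-probability margin relative to the frozen reference model.
    policy_margin = policy_chosen_logps - policy_rejected_logps
    ref_margin = ref_chosen_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Dummy per-response summed log-probabilities for a batch of 4 preference pairs
policy_chosen = torch.tensor([-12.0, -9.5, -14.2, -11.1], requires_grad=True)
policy_rejected = torch.tensor([-11.0, -10.0, -13.0, -12.5], requires_grad=True)
ref_chosen = torch.tensor([-12.5, -9.8, -14.0, -11.4])
ref_rejected = torch.tensor([-11.2, -9.9, -13.1, -12.0])

loss = dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected)
print(f"DPO loss: {loss.item():.3f}")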
Model alignment has been successfully applied across various domains:

Conversational AI. Aligning chatbots with user expectations for tone, relevance, and ethical standards.
Content generation. Optimizing models for tasks like summarization or creative writing based on user-defined quality metrics.
Ethical AI development. Ensuring models adhere to guidelines for fairness, safety, and inclusivity without extensive computational overhead.

Conclusion

This guide has walked through the nuts and bolts of LLM training. Are you ready to dive in? Many open-source models and datasets are waiting for you to experiment with and adapt to your specific problems.

As organizations embrace Kubernetes for cloud-native applications, managing infrastructure efficiently becomes challenging. Traditional Infrastructure as Code (IaC) tools like Terraform, Pulumi, and others provide declarative configurations but lack seamless integration into Kubernetes-native workflows. This is where Crossplane comes in, bridging the gap between Kubernetes and cloud infrastructure. In this blog, we'll explore how Crossplane enables IaC for Kubernetes and beyond.

What Is Crossplane?

Crossplane is an open-source Kubernetes add-on that enables you to provision and manage cloud infrastructure using Kubernetes Custom Resource Definitions (CRDs) and the Kubernetes API. Unlike traditional IaC tools that require external execution (for example, Terraform scripts run outside the cluster), Crossplane embeds infrastructure management into Kubernetes itself. This makes it truly declarative and GitOps-friendly.

Use Cases: Terraform vs. Crossplane

When to Use Terraform?

Best for managing infrastructure outside Kubernetes
Ideal for traditional multi-cloud deployments and VMs
Strong ecosystem with extensive modules and providers
Works well with tools like Ansible, Packer, and Vault for automation

When to Use Crossplane?

Best for Kubernetes-centric environments
Ideal for GitOps workflows (ArgoCD, Flux)
Enables self-service provisioning via Kubernetes CRDs
Good for multi-cloud Kubernetes control (managing cloud services via the K8s API)

Getting Started With Crossplane

For this sample, we will use minikube, but the same steps apply to any Kubernetes cluster.

Step 1: Deploy MySQL in Kubernetes

1. Deploy MySQL as a Deployment with a Service so that it can be configured using Crossplane. You can also use a MySQL instance deployed elsewhere.

2. Define a mysql-deployment.yaml, which creates the secret, deployment, and service required to run MySQL.

YAML

apiVersion: v1
kind: Secret
metadata:
  name: mysql-root-password
type: Opaque
data:
  password: cGFzc3dvcmQ=  # Base64 encoded "password"
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-service
spec:
  selector:
    app: mysql
  ports:
    - protocol: TCP
      port: 3306
      targetPort: 3306
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - image: mysql:8.0
          name: mysql
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-root-password
                  key: password
          ports:
            - containerPort: 3306
              name: mysql

3. Apply the YAML using the command kubectl apply -f mysql-deployment.yaml.

4. Verify the pods are up using the command kubectl get pods.

5. Verify the MySQL connection by starting a temporary SQL pod to check the MySQL deployment. Create the client using the command kubectl run mysql-client --image=mysql:8.0 -it --rm -- bash.

6. Connect to MySQL inside the pod using the command mysql -h mysql-service.default.svc.cluster.local -uroot -ppassword.

Step 2: Install Crossplane on Kubernetes

1. Install Crossplane using Helm:

Shell

kubectl create namespace crossplane-system
helm repo add crossplane-stable https://charts.crossplane.io/stable
helm repo update
helm install crossplane crossplane-stable/crossplane --namespace crossplane-system

Note: Crossplane takes a few minutes to come up.

2. Verify the Crossplane installation using the command kubectl get pods -n crossplane-system.

Step 3: Install the Crossplane Provider for SQL

1. Define the MySQL provider using the following YAML content.
YAML

apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-sql
spec:
  package: xpkg.upbound.io/crossplane-contrib/provider-sql:v0.9.0

2. Create the provider using the command kubectl apply -f provider.yaml.

3. Verify the provider using the following commands: kubectl get pods -n crossplane-system and kubectl get providers.

Note: The SQL provider takes a few minutes to come up.

Step 4: Configure the Crossplane MySQL Provider

The provider configuration tells Crossplane how to authenticate with MySQL. Define the secret to be created for provider usage, updating the stringData values as needed in the following YAML. Apply it using kubectl apply -f mysql-secret.yaml.

YAML

apiVersion: v1
kind: Secret
metadata:
  name: mysql-conn-secret
  namespace: default
type: Opaque
stringData:
  credentials: "root:password@tcp(mysql-service.default.svc.cluster.local:3306)"
  username: "root"
  password: "password"
  endpoint: "mysql-service.default.svc.cluster.local"
  port: "3306"

Apply the following provider configuration for Crossplane, which uses the above secret. Apply it using the command kubectl apply -f providerconfig.yaml.

YAML

apiVersion: mysql.sql.crossplane.io/v1alpha1
kind: ProviderConfig
metadata:
  name: mysql-provider
spec:
  credentials:
    source: MySQLConnectionSecret
    connectionSecretRef:
      name: mysql-conn-secret
      namespace: default

Verify the provider config creation using the commands kubectl get providerconfigs.mysql.sql.crossplane.io and kubectl get crds | grep providerconfig.

Step 5: Create a MySQL Database Using Crossplane

Now, use Crossplane to provision a new database. Use the following YAML and apply it using kubectl apply -f mysqlinstance.yaml.

YAML

apiVersion: mysql.sql.crossplane.io/v1alpha1
kind: Database
metadata:
  name: my-database
spec:
  providerConfigRef:
    name: mysql-provider
  forProvider:
    binlog: true
  writeConnectionSecretToRef:
    name: db-conn
    namespace: default

Step 6: Verify the Database Creation

Verify the database creation using the command kubectl get database.mysql.sql.crossplane.io/my-database. Use the same verification steps from Step 1 to connect to MySQL and confirm the database was created.

With the above steps, you have installed Crossplane, configured the MySQL provider, and used Crossplane to provision a database.

Can Terraform and Crossplane Work Together?

Terraform and Crossplane can be used together in many scenarios.

Scenario 1

In a complete IaC scenario, Terraform can be used to bootstrap Kubernetes clusters, and then Crossplane can manage cloud resources from within Kubernetes. Terraform can also deploy Crossplane itself. A hybrid workflow could look like this:

Terraform provisions the Kubernetes cluster in any cloud provider.
Crossplane manages cloud services (databases, storage, and networking) using Kubernetes CRDs.

Scenario 2

Crossplane also supports a Terraform provider, which can be used to run Terraform scripts as part of Crossplane's IaC model. Running a Terraform provider for Crossplane is useful in scenarios where Crossplane's native providers do not yet support certain cloud resources or functionalities.
The following are reasons to run a Terraform provider for Crossplane:

Terraform has a vast ecosystem of providers, supporting many cloud services that Crossplane may not yet have native providers for.
When an organization already uses Terraform for infrastructure management, there is no need to rewrite everything as Crossplane CRDs.
Crossplane supports multi-cloud management, but its native providers may not cover every on-premise or SaaS integration.
For organizations looking to gradually transition from Terraform to Crossplane, using Terraform providers within Crossplane can act as a hybrid solution before full migration.
Running Terraform inside Crossplane brings Terraform under Kubernetes' declarative GitOps model.

Steps to Create an IBM Cloud Cloudant DB Using Crossplane

Step 1. Define the Terraform provider.

YAML

apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-terraform
spec:
  package: xpkg.upbound.io/upbound/provider-terraform:v0.19.0

Step 2. Configure the provider.

YAML

apiVersion: tf.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: terraform-provider-ibm
spec: {}

Step 3. Provision a Cloudant DB in IBM Cloud by running a Terraform script through Crossplane.

YAML

apiVersion: tf.upbound.io/v1beta1
kind: Workspace
metadata:
  name: ibm-cloudant-db
spec:
  providerConfigRef:
    name: terraform-provider-ibm
  writeConnectionSecretToRef:
    name: ibmcloud-terraform-secret
    namespace: crossplane-system
  forProvider:
    source: Inline
    module: |
      terraform {
        required_providers {
          ibm = {
            source = "IBM-Cloud/ibm"
          }
        }
        backend "kubernetes" {
          secret_suffix = "ibmcloud-terraform-secret"
          namespace     = "crossplane-system"
        }
      }

      provider "ibm" {
        ibmcloud_api_key = var.ibmcloud_api_key
      }

      resource "ibm_cloudant" "cloudant_instance" {
        name     = "crossplanecloudant"
        location = "us-south"
        plan     = "lite"
      }

      variable "ibmcloud_api_key" {
        type = string
      }
    vars:
      - key: ibmcloud_api_key
        value: "<Your IBM Cloud API Key>"

This provisions a Cloudant DB named crossplanecloudant in IBM Cloud.

How Crossplane Fits Into Platform Engineering

Platform engineering focuses on building and maintaining internal developer platforms (IDPs) that simplify infrastructure management and application deployment. Crossplane plays a significant role here by enabling a Kubernetes-native approach: it ensures declarative, self-service, and policy-driven provisioning of cloud resources. Crossplane features such as declarative infrastructure via Kubernetes APIs, custom abstractions for infrastructure and applications, security and compliance guardrails, version-controlled and automated deployments, and continuous drift correction all support platform engineering goals.

Conclusion

Crossplane transforms how we manage cloud infrastructure by bringing IaC into the Kubernetes ecosystem. Its Kubernetes-native APIs enable a truly declarative, GitOps-driven approach to provisioning and managing cloud resources. If you're already using Kubernetes and looking to modernize your IaC strategy, Crossplane is definitely worth exploring.