Professor: Elizabeth Arvelo-Reyes

Md Afzal Hossain, Juan B Gomez, Nikhil Matlani,
Prathyusha Guduru, Md Talimul Islam
Palantir Technologies
You may never have heard of Palantir, and if you have, you may know little about it or assume that the company is a top-secret government program, but it really is not. It is a privately held company with its main headquarters in Silicon Valley, and it is known for employee perks such as free meals and a gym membership. The company was started about a decade ago, in 2004, by Dr. Karp and friends who had helped create PayPal. Dr. Karp holds a PhD in social theory. Palantir Technologies specializes in big data analytics. What led the company to big data was the observation that most data was fragmented and unstructured, so it built a tool that combines this data and makes it easy for anyone looking for information on something or someone to gather it. Before the CIA came knocking, however, the company was struggling to find its direction, until it decided to focus on the intelligence community. Working alongside the CIA, the company made many breakthroughs that helped form some of the key advantages it has today. Palantir may have started with the CIA, but the CIA is not the only organization it works with. It has been reported to work with many other organizations, from federal and city agencies to private companies.

Purpose of software
When Palantir started, the basic purpose of its software was to bring together massive amounts of information that the human eye could not take in and to surface patterns that were too complicated for humans to find. The software has been deployed to counter cybercriminals, respond to natural disasters, and deal with the ugly byproducts of war overseas. Commercial customers rely on Palantir to detect fraud, study consumer behavior, and search for the next competitive edge. The intelligence community uses Palantir tools to flag suspicious activities and to track the movement of money, contraband, and bad actors.

System First implemented

Palantir is an American private software company that specializes in analyzing big data. In the early 2000s, PayPal developed an analysis software system called Igor to identify and stop cyber fraud. Palantir began its journey in 2004 and from there started implementing its own system to meet client demand.
Initially, Palantir tried to build software similar to PayPal’s fraud recognition systems, with the goal of reducing terrorism while preserving civil liberties. Later, as it focused more on security and on clients such as large government agencies, it concentrated on finding patterns in data in order to deliver structured information.
Palantir’s software lets users easily export data from Palantir’s raw formats to another system as an analytical product. Palantir provides built-in tools for users to export data in formats that include:
• Raw data exports (XML)
• Dossier views and reports (HTML, Word document reports)
• Data summaries (Excel or CSV spreadsheets)
• Investigation history and graph visualization (PowerPoint)
• Geospatial data (OGC, WMS, KML)

Palantir Technologies offers two primary value propositions: accessibility and customization. Palantir products are used by private, public, and non-profit organizations, and the system can be used for a wide range of applications, such as:
• Anti-Fraud
• Case Management
• Cyber Security
• Disaster Preparedness
• Healthcare Delivery
• Insurance Analytics
• Law Enforcement
• Trader Oversight
• Capital Markets
• Crisis Response
• Defense
• Disease Response
• Insider Threat
• Intelligence
• Legal Intelligence
• Pharma R&D

Palantir provides a utility to export any or all of the contents of the data repository in XML format. Exports can include all integrated data, including all data imported through the back end, front end, or created directly by users within the application. Once exported, the data can be ingested by any system that can parse XML files.
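To illustrate what "any system that can parse XML files" means in practice, the following is a minimal sketch in Python using only the standard library. The element and attribute names (`export`, `object`, `property`) and the sample data are assumptions made up for this example. They are not Palantir's actual export schema, which this report does not describe.

```python
import xml.etree.ElementTree as ET

# Hypothetical export file content; the real export schema is not shown here.
SAMPLE_EXPORT = """<export>
  <object id="1" type="person">
    <property name="name">Jane Doe</property>
    <property name="city">Miami</property>
  </object>
  <object id="2" type="account">
    <property name="bank">Example Bank</property>
  </object>
</export>"""


def load_objects(xml_text):
    """Turn each <object> element into a plain dictionary."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        record = {"id": obj.get("id"), "type": obj.get("type")}
        for prop in obj.findall("property"):
            record[prop.get("name")] = prop.text
        objects.append(record)
    return objects


objects = load_objects(SAMPLE_EXPORT)
for record in objects:
    print(record)
```

Once the records are ordinary dictionaries like these, a receiving system could load them into its own database or reporting tool.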
Palantir is deployed across hundreds of organizations, including agencies within the US Department of Defense, intelligence community, and law enforcement community, as well as with allied foreign governments and large commercial organizations. Palantir’s APIs can be used to extend the platform to interoperate with third-party services, such as entity extractors or natural language processing tools. Palantir can also be extended to accommodate new and unforeseen challenges through the development of custom helpers, applications, and other tools. By augmenting your existing technology investments, rather than trying to replace them, Palantir enhances the value and capability of a diverse system landscape.
Put simply, Palantir performs tasks that many organizations cannot carry out on their own or with their existing software. Its tools work as a supporting layer that lets a company’s users access information easily, in a structured and organized manner. In other words, Palantir’s tools help analyze data and produce a simple, visual understanding that the organization’s own software systems could not provide.
Palantir system and components

Palantir builds customized software platforms for companies, government agencies, and the military to help them integrate their enormous and disparate sets of data and search through huge data sets at lightning speed. Palantir engineers work directly with clients to build the software platform. This customized software involves the development and release of software designed for a specific user or business. Palantir products are used throughout the public, private, and non-profit sectors to help organizations quickly implement solutions to the hardest problems they face.
Types of Domains Palantir Works With
Domain Use
Accelerate automotive quality improvements and increase safety by integrating data across a vehicle’s lifecycle.
Implement a flexible solution for collaborative investigations and general business process management.
Return to security first principles and strengthen cybersecurity hygiene with a comprehensive and dynamic network view.
Synchronize multi-INT data for analysis across staff functions and locations, from reach back facilities to the edge of the battlefield.
Treat financial compliance as a data problem and solve it with a comprehensive platform to accelerate the business.
Strengthen trust in how your enterprise uses information — and achieve the potential of your digital transformation.
Integrate and analyze health data to identify fraudulent schemes, improve patient outcomes, and identify meaningful trends.
Efficiently, effectively, and securely exploit and analyze data to drive more informed operational planning and strategic decision-making.
Equip agencies with the intelligence, investigations, case management, and reporting tools they need to respond to crime as it happens.
Discover patterns of behavior and links between key actors that lie hidden in vast amounts of structured and unstructured data.
Join the fourth industrial revolution by harnessing data across the value chain, including supply chain, production, and operations data.
Accelerate realization of deal synergies while maintaining a flexible systems strategy.
Improve the fidelity of drug development with real-world data and maintain rigorous standards of patient data privacy.
Use data from Airbus, suppliers, airlines, and your global fleet to reduce delays, lower operating costs, and improve the passenger experience.
Protect your organization’s sensitive information and intellectual property from theft, misuse, and abuse.

Palantir is used by organizations such as the CIA, the FBI, the military, hedge funds, and retailers because it provides premier data analytics products that have become essential in today’s world. Palantir has two different products: Palantir Gotham and Palantir Foundry.

– Starts with data from multiple sources
– Integrates, manages, secures, and analyzes unstructured data
– Connects the dots: works with both structured and unstructured data
– Gives a visual representation of your data to help you understand trends
– Creates various types of tables, applications, reports, presentations, and spreadsheets

Palantir’s software lets users pull out the data they need using natural language. You do not need any special expertise, and you do not need to hire someone who can query your data. Results are returned in real time. Palantir’s software sits on top of the existing data sets to give users a dynamic interface. What is so great about Palantir is that you control and own your data. It uses open data formats and lets you export your data in any way.


The three top-level architectural components of the Palantir system are:

A. The Palantir Access Point
B. The Palantir Gateways
C. Monitoring and Information Systems modules
A. The Palantir Access Point:
The Palantir Meta-Information System access point is the first layer of Palantir. It controls all the queries entered by users and sends them to the right Palantir gateway. Queries can take one of two forms. In a monitoring information query, the client must specify the information that needs to be gathered. In a discovery information query, the client does not have to be aware of the final queries that are issued by MDS.
B. The Palantir Gateways:
The second layer, the Palantir Gateways, is located at the local centers. This layer is responsible for sending queries to the information system and monitoring system modules, so that each query is processed by the right module. The Palantir Gateways hold the entire database and, when needed, search it and retrieve the data accordingly. There are two types of entities:
1. Persistent entities. These entities are available for the long term. Entities like this can be stored in the database together with other data where the information systems are located.
2. Non-persistent entities. These entities are limited and short term. An entity of this kind has a composed key that contains information about where the entity can be found.
C. Monitoring and Information Systems modules:
The third layer is responsible for the modules. The final queries are sent to these systems accordingly. Because queries are presented to a module in a standard format, the module can decide which section to look in and respond.
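The flow through the three layers described above can be sketched roughly as follows. This is only an illustration under our own assumptions: the class names, the query kinds (`"monitoring"` and `"information"`), the center name, and the sample records are all hypothetical and are not taken from Palantir's software.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """Third layer: a monitoring or information system module (hypothetical)."""
    name: str
    records: dict

    def answer(self, key):
        # Look up the requested item; None means the module has no such record.
        return self.records.get(key)


@dataclass
class Gateway:
    """Second layer: forwards each query to the module that should process it."""
    modules: dict  # query kind -> Module

    def route(self, kind, key):
        if kind not in self.modules:
            raise ValueError(f"no module for query kind {kind!r}")
        return self.modules[kind].answer(key)


@dataclass
class AccessPoint:
    """First layer: receives user queries and picks the right gateway."""
    gateways: dict  # local center name -> Gateway

    def query(self, center, kind, key):
        return self.gateways[center].route(kind, key)


gateway = Gateway(modules={
    "monitoring": Module("monitoring", {"cpu_load": 0.42}),
    "information": Module("information", {"owner": "center-a"}),
})
access_point = AccessPoint(gateways={"center-a": gateway})
result = access_point.query("center-a", "monitoring", "cpu_load")
print(result)
```

The point of the sketch is only the separation of duties: the user talks to the access point, the access point chooses a gateway, and the gateway chooses a module.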
Palantir consistently works on an enormous variety of projects involving complex real-world problems and handles an incredible amount of data for the use of its clients and partners. When Palantir first collects data from a client, it structures the data in a simple way so that the client, as owner of the data, can use it without difficulty. It does this with future use in mind, so that the data can be traced back to its source if required. Palantir has unique access to huge problems and the resources to build solutions. It builds some of the world’s most advanced products within a flat structure that boosts and nurtures initiative.
For security reasons, many IT companies use closed systems, but Palantir embeds its technology in each client’s business so that the client can view data interactively and customize and analyze different parts of it, which helps clients make their own decisions on important issues. Palantir is also moving toward non-proprietary data formats and transfers data through an API (Application Programming Interface). Access to data collected in the system goes through a hands-on, interactive process. A strength here is that the Palantir system combines data from different sources, such as the client itself, the client’s customers, and the client’s banking transactions, to ensure secure procedures. When combining and storing queried data, Palantir also respects the privacy of clients and data users.
Palantir Site Reliability Engineers (SREs) are equipped to work in any part of the company, for example in the field with clients or customers, on a development team, on the infrastructure team, or on the product reliability team. The tech stack is extraordinarily multifaceted and involves many open source and proprietary systems. SREs operate autonomously and are often encouraged to innovate in many different ways.
Palantir applies its fraud recognition system with a focus on security when serving government agencies such as the CIA, the FBI, and police departments. Palantir’s customized software helps organizations quickly implement solutions to the hard problems they face, because the software is designed to counter cybercriminals and respond to natural disasters. Palantir’s business clients depend on it to detect fraud, study customer behavior, and search for the next competitive edge. The intelligence community uses Palantir tools to flag suspicious activities. The Palantir system puts security first and reinforces cybersecurity hygiene with a complete and dynamic network view. Its defense offering synchronizes multi-INT data for analysis across staff functions and locations. Palantir’s intelligence tools exploit and analyze data efficiently, effectively, and securely to drive better-informed operational planning and strategic decision-making. Palantir is able to protect an organization’s sensitive information and intellectual property from theft, misuse, and abuse.
Given the huge amount of work being done, the flat structure and autonomy create a disordered environment, and this is very much by design. Because Palantir works embedded within client organizations, some of which are toxic, those problems can contaminate its own system. Palantir is also not immune to the all-pervasive tension between Product Development and Business Development.
One of Palantir’s weaknesses is poor product marketing, and its products are not affordable for many companies. Usually governments and very large companies use its products. Palantir puts little effort into marketing, which gives the impression that it ignores many businesses, and this affects its business. As a big data company, Palantir declines to work with some companies. Even when it does work with a company, it vets the company very tightly and sets conditions for having full access to the client’s database.
Palantir’s databases are accessible to many clients, which makes them vulnerable in the sense that other parties could use them to damage a company. Like all big databases, especially those involved with security, Palantir’s information is a target for terrorists and could be used against a company for various reasons, even causing national security problems.
Last but not least, another weakness is the risk of data leaks. External or intentional threats such as malware, hacking, social engineering, Trojans, and viruses can cause a data leak. Internal or unintentional threats such as configuration errors, improper encryption, accidental publishing, and privilege abuse can also cause a data leak.

Palantir Internal/External Data Breach Mitigation Plan
Palantir’s operations include mining huge amounts of data and finding patterns that would be missed by human eyes.
The threat we are dealing with is internal and external data breaches. The following chart helps us classify internal and external data breaches and so gives us a better understanding of the threats.

One approach to classifying data leak threats is based on their cause: whether sensitive data is leaked deliberately or inadvertently. Another approach is based on which party triggered the leak: insider or outsider threats. As shown in the chart above, intentional leaks occur because of both external parties and malicious insiders. External data breaches are commonly caused by hacker break-ins, malware, viruses, and social engineering. For example, an adversary may exploit a system backdoor or misconfigured access controls to bypass a server’s authentication mechanism and gain access to sensitive information. Social engineering attacks (e.g., phishing) are becoming increasingly sophisticated, fooling employees and individuals into handing over valuable organizational data to cyber criminals. Internal data leaks may be caused by either deliberate actions (e.g., espionage for financial reward or employee grievances) or inadvertent mistakes (e.g., unintended data sharing by employees or transmitting personal data without proper encryption).

DLPD approach:

Content-based DLPD searches for known sensitive information that resides on laptops, servers, or cloud storage, or in outbound network traffic. It relies largely on data fingerprinting, lexical content analysis (e.g., rule-based and regular expressions), or statistical analysis of the monitored data. In data fingerprinting, signatures (or keywords) of known sensitive content are extracted and compared with the content being monitored in order to detect data leaks, where signatures can be digests or hash values of a set of data. Fingerprinting extracts fingerprints from the core confidential content while ignoring non-relevant (nonconfidential) parts of a document, which improves robustness to the rephrasing of confidential content. Lexical analysis is used to find sensitive information that follows simple patterns. For example, regular expressions can be used to detect structured data in documents, including social security numbers, credit card numbers, medical terms, and geographical information.
Statistical analysis mainly involves analyzing the frequency of shingles/n-grams, which are typically fixed-size sequences of contiguous bytes within a document. Another line of research covers item weighting schemes and similarity measures in statistical analysis, where item weighting assigns different importance scores to items (i.e., n-grams) rather than treating them equally.
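The lexical analysis and fingerprinting ideas above can be sketched as follows. This is a simplified illustration and not how any particular DLP product works. The two regular expressions cover only one common written form each (US social security numbers as `123-45-6789` and 16-digit card numbers), and the fingerprint here is an exact SHA-256 digest of a normalized line, so unlike the fingerprinting described above it is not robust to rephrasing. The function names and the sample text are our own.

```python
import hashlib
import re

# Lexical patterns (assumed formats; real detectors use many more rules).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")


def fingerprint(text):
    """Digest of the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def scan(document, sensitive_fingerprints):
    """Return (kind, matched text) pairs found in the monitored document."""
    findings = [("ssn", m.group()) for m in SSN_PATTERN.finditer(document)]
    findings += [("card", m.group()) for m in CARD_PATTERN.finditer(document)]
    for line in document.splitlines():
        if line.strip() and fingerprint(line) in sensitive_fingerprints:
            findings.append(("fingerprint", line.strip()))
    return findings


# Fingerprints of known sensitive sentences are computed ahead of time.
known = {fingerprint("Project Falcon budget is 2 million")}
outgoing = (
    "Hello team\n"
    "SSN 123-45-6789\n"
    "project falcon  BUDGET is 2 million\n"
    "Card 4111 1111 1111 1111"
)
findings = scan(outgoing, known)
for kind, text in findings:
    print(kind, "->", text)
```

Storing only digests means the detector does not need to keep a readable copy of the sensitive sentences themselves.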
Collection intersection is a commonly used statistical analysis method in detecting the presence of sensitive data. Two collections of shingles are compared, and the similarity score is computed between content sequences being monitored and sensitive data sequences that are not allowed to leave enterprise networks.
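A minimal sketch of collection intersection is shown below. We assume 3-byte shingles and define the score as the fraction of the sensitive data's shingles that also appear in the monitored content. Other shingle sizes and similarity measures are possible, and the function names and sample strings are our own.

```python
def shingles(text, n=3):
    """Set of all n-byte sequences of contiguous bytes in the text."""
    data = text.encode("utf-8")
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def intersection_rate(monitored, sensitive, n=3):
    """Share of the sensitive shingles that also occur in the monitored content."""
    sensitive_set = shingles(sensitive, n)
    if not sensitive_set:
        return 0.0
    monitored_set = shingles(monitored, n)
    return len(sensitive_set & monitored_set) / len(sensitive_set)


leak_score = intersection_rate("the secret plan is ready", "secret plan")
clean_score = intersection_rate("nothing to see here", "secret plan")
print(leak_score, clean_score)
```

A score close to 1 suggests the sensitive sequence is present in the monitored content, and a score close to 0 suggests it is not. In practice a threshold would be chosen to decide when to raise an alert.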
Symantec Data Loss Prevention on Enforce Platform
Requirement Satisfying Key Features
Comprehensive data detection technology
Enforce Platform accurately detects all types of confidential data, wherever it is stored or used. It uses three classes of detection technologies to provide complete and accurate coverage across your endpoint, network, and storage systems:
– Describing protects structured and unstructured data by looking for content matches on keywords, expressions or patterns, and signatures.
– Fingerprinting protects structured and unstructured data by looking for exact or partial content matches on indexed data sources and documents.
– Learning protects unstructured textual data by building a statistical model using example documents and calculating content similarity.
Describing, fingerprinting, and learning are critical to achieving high detection accuracy and preventing data loss. Without accuracy, your Data Loss Prevention system will generate numerous false positives and negatives. False positives waste limited time and resources on investigating and resolving incidents that do not actually violate security policy. False negatives obscure gaps in security by allowing data loss and putting your organization at risk of a breach.
Proven policy authoring and tuning framework
Enforce Platform enables you to write policies once and enforce them everywhere. With our out-of-the box policy templates, data identifiers, and solution packs, you can quickly and easily reduce data loss risk starting on day one.
Comprehensive reporting and remediation workflow
Ninety percent of data loss prevention (DLP) is about what you do after you find sensitive data. Enforce Platform is where you can view reports, automatically notify users of violations, and enable automatic workflow responses.
Comprehensive incident reporting and remediation enables you to take the right action at the right time and communicate data loss risk to your business units.

