Key Features of Teradata
The following are the key features of the Teradata Database:
- Single data store
- Unconditional parallelism (parallel architecture)
- Ability to model the business
- Mature, parallel-aware Optimizer
Single Data Store
The Teradata Database acts as a single data store, with multiple client applications making inquiries against it concurrently.
Instead of replicating a database for different purposes, with the Teradata Database you store the data once and use it for many applications. The Teradata Database provides the same connectivity for an entry-level system as it does for a massive enterprise data warehouse.
“Linear scalability” means that as you add components to the system, performance increases in proportion. Adding components allows the system to accommodate an increased workload without decreased throughput, so the system can grow to support more users, more data, more queries, and more complex queries without performance degradation. As the configuration grows, the performance increase is linear, with a slope of 1. The Teradata Database was the first commercial database system to scale to and support a trillion bytes of data.
The chart below lists the meaning of the prefixes:
The Teradata Database can scale from 100 gigabytes to more than 100 petabytes of data on a single system without losing any performance capability. This scalability provides investment protection for customers’ growth and application development. The Teradata Database is the only database that is predictably scalable in multiple dimensions, and this extends to data loading with the use of parallel loading utilities. The Teradata Database distributes data automatically, so no reorganizations of data are needed. It is scalable in multiple ways, including hardware, query complexity, and number of concurrent users.
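The "slope of 1" idea behind linear scalability can be checked with simple arithmetic. The sketch below is illustrative only: the function name and the throughput figures are hypothetical and are not Teradata measurements. It compares the relative growth in throughput with the relative growth in node count; a result of 1.0 corresponds to linear scaling.

```python
def scaling_efficiency(nodes_before: int, throughput_before: float,
                       nodes_after: int, throughput_after: float) -> float:
    """Return relative throughput growth divided by relative node growth.

    1.0 means linear scaling (slope of 1); values below 1.0 mean the
    added nodes delivered less than their proportional share.
    """
    throughput_growth = throughput_after / throughput_before
    node_growth = nodes_after / nodes_before
    return throughput_growth / node_growth


# Hypothetical figures: doubling the nodes doubles queries per hour.
linear = scaling_efficiency(4, 1000.0, 8, 2000.0)

# Hypothetical figures: doubling the nodes adds only 50% more throughput.
sublinear = scaling_efficiency(4, 1000.0, 8, 1500.0)
```

A system that scales linearly keeps this ratio at or near 1.0 as the configuration grows.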
Growth is a fundamental goal of business. An MPP system easily accommodates that growth whenever it happens. The Teradata Database runs on highly optimized Teradata servers in the following configurations:
- SMP – Symmetric multiprocessing platforms manage terabytes of data to support an entry-level data warehousing system.
- MPP – Massively parallel processing systems can manage hundreds of petabytes of data. You can start with a couple of nodes, and later expand the system as your business grows.
With the Teradata Database, you can increase the size of your system without replacing:
- Databases – When you expand your system, the data is automatically redistributed through the reconfiguration process, without manual interventions such as sorting, unloading and reloading, or partitioning.
- Platforms – The modular structure allows you to add components to your existing system.
- Data model – The physical and logical data models remain the same regardless of data volume.
- Applications – Applications you develop for Teradata Database configurations will continue to work as the system grows, protecting your investment in application development.
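The automatic redistribution described in the list above can be pictured with a small sketch. This is a simplified illustration, not Teradata's actual hashing algorithm: it assumes rows are assigned to processing units by hashing a key, and the function names, the use of SHA-256, and the modulo mapping are all choices made for this example. The point is that when the number of units changes, recomputing the assignment spreads the same rows evenly across the larger system with no manual sorting, unloading, or partitioning.

```python
import hashlib
from collections import Counter


def unit_for(key: str, num_units: int) -> int:
    """Map a row key to a processing unit using a stable hash (illustrative)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_units


def distribute(keys, num_units: int) -> Counter:
    """Count how many rows land on each processing unit."""
    return Counter(unit_for(key, num_units) for key in keys)


# Hypothetical row keys.
keys = [f"customer-{i}" for i in range(10_000)]

before = distribute(keys, 4)   # original configuration
after = distribute(keys, 8)    # expanded configuration, same rows
```

In both configurations each unit holds a roughly equal share of the rows, which is what lets every unit do a similar amount of work.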
The Teradata Database is adept at complex data models that satisfy the information needs throughout an enterprise. The Teradata Database efficiently processes increasingly sophisticated business questions as users realize the value of the answers they are getting. It has the ability to perform large aggregations during query run time and can perform up to 128 joins in a single query.
As is proven in every Teradata Database benchmark, the Teradata Database can handle the most concurrent users, who are often running multiple, complex queries. The Teradata Database has the proven ability to handle from hundreds to thousands of users on the system simultaneously. Adding many concurrent users typically reduces system performance. However, adding more components can enable the system to accommodate the new users with equal or even better performance.
Unconditional Parallelism
The Teradata Database provides exceptional performance using parallelism to achieve a single answer faster than a non-parallel system. Parallelism uses multiple processors working together to accomplish a task quickly.
An example of parallelism can be seen at an amusement park, as guests stand in line for an attraction such as a roller coaster. As the line approaches the boarding platform, it typically will split into multiple, parallel lines. That way, groups of people can step into their seats simultaneously. The line moves faster than if the guests step onto the attraction one at a time. At the biggest amusement parks, the parallel loading of the rides becomes essential to their successful operation.
Parallelism is evident throughout a Teradata Database, from the architecture to data loading to complex request processing. The Teradata Database processes requests in parallel without mandatory query tuning. Its parallelism does not depend on limited data quantity, column range constraints, or specialized data models. The Teradata Database provides “unconditional parallelism,” meaning that there are no serial bottlenecks.
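The general pattern behind parallel request processing, in which each worker handles its own slice of the data and the partial results are combined into a single answer, can be sketched as follows. This is a minimal illustration using Python's standard thread pool, not a description of Teradata internals; the function names and the choice of four workers are assumptions made for the example. Threads are used here to show the structure of the computation rather than to demonstrate a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(partition):
    """Aggregate one worker's slice of the data."""
    return sum(partition)


def parallel_total(values, num_workers: int = 4):
    """Split the values across workers, aggregate each slice, then combine."""
    # Every worker receives an interleaved slice of roughly equal size.
    partitions = [values[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(partial_sum, partitions))
    # The final step merges the partial answers into one result.
    return sum(partials)


total = parallel_total(list(range(1, 101)))
```

As with the parallel boarding lines in the amusement park example, no single worker has to process every item before the answer is available.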
Teradata supports ad hoc queries written in ANSI-standard SQL, which allows it to interface with third-party Business Intelligence (BI) tools and to accept queries submitted from other database systems.
Ability to Model the Business
A data warehouse built on a business model contains information from across the enterprise. Individual departments can use their own assumptions and views of the data for analysis, yet these varying perspectives have a common basis for a “single view of the business.”
With the Teradata Database’s centrally located, logical architecture, companies can get a cohesive view of their operations across functional areas to:
- Find out which divisions share customers.
- Track products throughout the supply chain, from initial manufacture, to inventory, to sale, to delivery, to maintenance, to customer satisfaction.
- Analyze relationships between results of different departments.
- Determine if a customer on the phone has used the company’s website.
- Vary levels of service based on a customer’s profitability.
You get consistent answers from the different viewpoints above using a single business model, rather than functional models for different departments. In a functional model, data is organized according to what is done with it. But what happens if users later want to do some analysis that has never been done before? When a system is optimized for one department’s function, the other departments’ needs (and future needs) may not be met.
A Teradata Database models a customer’s business with data organized according to what it represents, not how it is accessed, so it is easy to understand. The data model should be designed without regard to usage and be the same regardless of data volume. With a Teradata Database as the enterprise data warehouse, users can ask new questions of the data that were never anticipated, throughout the business cycle and even through changes in the business environment.
A key Teradata Database strength is its ability to model the customer’s business. The Teradata Database supports business models that are truly normalized, avoiding the costly star schema and snowflake implementations that many other database vendors use. The Teradata Database can support star schema and other types of relational modeling, but Third Normal Form is the method for relational modeling that we recommend to customers. Our competitors typically implement star schema or snowflake models either because they are implementing a set of known queries in a transaction processing environment, or because their architecture limits them to that type of model. Normalization is the process of reducing a complex data structure into a simple, stable one. Generally this process involves removing redundant attributes, keys, and relationships from the conceptual data model. The Teradata Database supports normalized logical models because it is able to perform 128 table joins and large aggregations during queries.
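To make the idea of a normalized model concrete, the sketch below builds a tiny Third Normal Form schema and answers a business question with joins and an aggregation. It uses SQLite from Python's standard library purely as a stand-in so the example is runnable anywhere; it is not Teradata, and the table names, column names, and data are hypothetical. Each fact is stored once (customer names in one table, prices in another), and the query combines them at run time.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a small normalized schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE product (
            product_id  INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            unit_price  INTEGER NOT NULL
        );
        CREATE TABLE sale (
            sale_id     INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
            product_id  INTEGER NOT NULL REFERENCES product(product_id),
            quantity    INTEGER NOT NULL
        );
        INSERT INTO customer VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO product  VALUES (10, 'Widget', 5), (11, 'Gadget', 20);
        INSERT INTO sale     VALUES (100, 1, 10, 3), (101, 1, 11, 1), (102, 2, 10, 4);
    """)
    return conn


def revenue_by_customer(conn: sqlite3.Connection):
    """Join the three tables and total the revenue for each customer."""
    return conn.execute("""
        SELECT c.name, SUM(s.quantity * p.unit_price) AS revenue
        FROM sale AS s
        JOIN customer AS c ON c.customer_id = s.customer_id
        JOIN product  AS p ON p.product_id  = s.product_id
        GROUP BY c.name
        ORDER BY c.name
    """).fetchall()


rows = revenue_by_customer(build_demo_db())
```

Because the data is organized by what it represents, a new question (for example, revenue by product instead of by customer) needs only a different query, not a different schema.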
Mature, Parallel-Aware Optimizer
The Teradata Database Optimizer is the most robust in the industry, able to handle:
- Multiple complex queries
- Multiple joins per query
- Unlimited ad-hoc processing
The Optimizer is parallel-aware, meaning that it has knowledge of the system components (how many nodes, vprocs, and so on). It determines the least expensive plan, measured in time, for processing each query quickly and in parallel. The Optimizer is further explained in the next module.
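What "parallel-aware" plan selection means can be sketched with a toy cost model. Everything in this example is an assumption made for illustration: the two plan names, the cost formulas, and the row counts are invented and do not reflect the actual Teradata Optimizer. The sketch shows only the general idea that the cheapest plan can depend on how many processing units the system has.

```python
def plan_costs(small_rows: int, large_rows: int, num_units: int) -> dict:
    """Estimate per-unit work (rows handled) for two hypothetical join plans."""
    return {
        # Every unit receives a full copy of the small table and joins it
        # to its local slice of the large table.
        "duplicate_small_table": small_rows + large_rows / num_units,
        # Both tables are re-spread by join key; each unit handles one slice
        # of each table twice (once to move the rows, once to join them).
        "redistribute_both_tables": 2 * (small_rows + large_rows) / num_units,
    }


def choose_plan(small_rows: int, large_rows: int, num_units: int) -> str:
    """Pick the plan with the lowest estimated per-unit cost."""
    costs = plan_costs(small_rows, large_rows, num_units)
    return min(costs, key=costs.get)


# Hypothetical table sizes on a small and a large configuration.
plan_on_small_system = choose_plan(400_000, 1_000_000, num_units=4)
plan_on_large_system = choose_plan(400_000, 1_000_000, num_units=100)
```

With these invented numbers, the small configuration favors copying the smaller table to every unit, while the large configuration favors redistributing both tables, because a full copy per unit becomes the dominant cost as units are added.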