EITC/IS/CCTF Computational Complexity Theory Fundamentals is the European IT Certification programme on theoretical aspects of foundations of computer science which are also a basis of classical asymmetric public-key cryptography vastly used in the Internet.
The curriculum of the EITC/IS/CCTF Computational Complexity Theory Fundamentals covers theoretical knowledge on foundations of computer science and computational models upon basic concepts such as deterministic and nondeterministic finite state machines, regular languages, context free grammers and languages theory, automata theory, Turing Machines, decidability of problems, recursion, logic and complexity of algorithmics for fundamental security applications within the following structure, encompassing comprehensive video didactic content as a reference for this EITC Certification.
An algorithm’s computational complexity is the amount of resources required to operate it. Time and memory resources are given special attention. The complexity of a problem is defined as the complexity of the best algorithms for solving it. Analysis of algorithms is the study of the complexity of explicitly given algorithms, whereas computational complexity theory is the study of the complexity of problems solutions with best known algorithms. Both domains are intertwined because an algorithm’s complexity is always an upper constraint on the complexity of the problem it solves. Furthermore, it is frequently necessary to compare the complexity of a certain algorithm to the complexity of the problem to be solved while constructing efficient algorithms. In most circumstances, the only information available regarding a problem’s difficulty is that it is less than the complexity of the most efficient known techniques. As a result, there is a lot of overlap between algorithm analysis and complexity theory.
Complexity theory plays an important not only in foundations of computational models as basis for computer science but also in foundations of classical asymmetric cryptography (so called public-key cryptography) which is widely disseminated in modern networks, especially in the Internet. The public-key encryption is based on computational difficult of certain asymmetric mathematical problems such as for example factorization of large numbers into its prime factors (this operation is a hard problem in the complexity theory classification, because there are not known efficient classical algorithms to solve it with resources scaling polynomially rather than exponentially with the increase of the problem’s input size, which is in contrast to a very simple reverse operation of multiplying to known prime factors to give the original large number). Using this asymmetry in an architecture of the public-key cryptography (by defining a computationally asymmetric relation between the public key, that can be easily computed from a private key, while the private key cannot be easily computer from a public key, one can publicly announce the public key and enable other communication sides to use it for asymmetric encryption of data, which can then only be decrypted with the coupled private key, not available computationally to third parties, thus making the communication secure).
The computational complexity theory was developed mainly on achievements of computer science and algorithmics pioneers, such as Alan Turing, whose work was critical to breaking the Enigma cipher of Nazi Germany, which played a profound role in allies winning the Second World War. Cryptanalysis aiming at devising and automating the computational processes of analyzing data (mainly encrypted communication) in order to uncover the hidden information was used to breach cryptographic systems and gain access to the contents of encrypted communication, usually of strategic military importance. It was also cryptanalysis which catalyzed development of first modern computers (which were initially applied to a strategical goal of codebreaking). The British Colossus (considered as the first modern electronic and programme computer) was preceded by the Polish “bomb”, an electronic computational device designed by Marian Rejewski to assist in breaking Enigma ciphers, and handed over to Great Britain by the Polish intelligence along with the captured German Enigma encryption machine, after Poland was invaded by Germany in 1939. On the basis of this device Alan Turing developed its more advanced counterpart to successfully break German encrypted communication, which has been later developed into modern computers.
Because the amount of resources required to run an algorithm varies with the size of the input, the complexity is usually expressed as a function f(n), where n is the input size and f(n) is either the worst-case complexity (the maximum amount of resources required across all inputs of size n) or the average-case complexity (the average of the amount of resources over all inputs of size n). The number of required elementary operations on an input of size n is commonly stated as time complexity, where elementary operations are believed to take a constant amount of time on a particular computer and change only by a constant factor when run on a different machine. The amount of memory required by an algorithm on an input of size n is known as space complexity.
Time is the most usually considered resource. When the term “complexity” is used without qualifier, it usually refers to the complexity of time.
The traditional units of time (seconds, minutes, and so on) are not employed in complexity theory since they are too reliant on the computer chosen and the advancement of technology. For example, a computer today can execute an algorithm substantially quicker than a computer from the 1960s, yet, this is due to technological breakthroughs in computer hardware rather than an inherent quality of the algorithm. The goal of complexity theory is to quantify the inherent time needs of algorithms, or the fundamental time limitations that an algorithm would impose on any computer. This is accomplished by counting how many basic operations are performed during the computation. These procedures are commonly referred to as steps because they are considered to take constant time on a particular machine (i.e., they are unaffected by the amount of the input).
Another crucial resource is the amount of computer memory required to perform algorithms.
Another often used resource is the amount of arithmetic operations. In this scenario, the term “arithmetic complexity” is used. The time complexity is generally the product of the arithmetic complexity by a constant factor if an upper constraint on the size of the binary representation of the numbers that occur during a computation is known.
The size of the integers utilized during a computation is not constrained for many methods, and it is unrealistic to assume that arithmetic operations require a fixed amount of time. As a result, the time complexity, also known as bit complexity, may be significantly higher than the arithmetic complexity. The arithmetic difficulty of computing the determinant of a nn integer matrix, for example, is O(n^3) for standard techniques (Gaussian elimination). Because the size of the coefficients might expand exponentially during the computation, the bit complexity of the same methods is exponential in n. If these techniques are used in conjunction with multi-modular arithmetic, the bit complexity can be decreased to O(n^4).
The bit complexity, in formal terms, refers to the number of operations on bits required to run an algorithm. It equals the temporal complexity up to a constant factor in most computation paradigms. The number of operations on machine words required by computers is proportional to the bit complexity. For realistic models of computation, the time complexity and bit complexity are thus identical.
The resource that is often considered in sorting and searching is the amount of entries comparisons. If the data is well arranged, this is a good indicator of the time complexity.
On all potential inputs, counting the number of steps in an algorithm is impossible. Because the complexity of an input rises with its size, it is commonly represented as a function of the input’s size n (in bits), and so the complexity is a function of n. For the same-sized inputs, however, the complexity of an algorithm can vary substantially. As a result, a variety of complexity functions are routinely employed.
The worst-case complexity is the sum of all complexity for all size n inputs, while the average-case complexity is the sum of all complexity for all size n inputs (this makes sense, as the number of possible inputs of a given size is finite). When the term “complexity” is used without being further defined, the worst-case time complexity is taken into account.
The worst-case and average-case complexity are notoriously difficult to calculate correctly. Furthermore, these exact values have little practical application because any change in machine or calculation paradigm would vary the complexity slightly. Furthermore, resource usage is not crucial for small values of n, therefore ease of implementation is often more appealing than low complexity for small n.
For these reasons, most attention is paid to the complexity’s behavior for high n, that is, its asymptotic behavior as n approaches infinity. As a result, large O notation is commonly used to indicate complexity.
Computational models
The choice of a computation model, which consists of specifying the essential operations that are performed in a unit of time, is crucial in determining the complexity. This is sometimes referred to as a multitape Turing machine when the computation paradigm is not specifically described.
A deterministic model of computation is one in which the machine’s subsequent states and the operations to be performed are entirely defined by the previous state. Recursive functions, lambda calculus, and Turing machines were the first deterministic models. Random-access machines (also known as RAM-machines) are a popular paradigm for simulating real-world computers.
When the computation model isn’t defined, a multitape Turing machine is usually assumed. On multitape Turing machines, the time complexity is the same as on RAM machines for most algorithms, albeit considerable attention in how data is stored in memory may be required to achieve this equivalence.
Various choices may be made at some steps of the computation in a non-deterministic model of computing, such as non-deterministic Turing machines. In complexity theory, all feasible options are considered at the same time, and non-deterministic time complexity is the amount of time required when the best choices are always made. To put it another way, the computation is done concurrently on as many (identical) processors as are required, and the non-deterministic computation time is the time taken by the first processor to complete the computation. This parallelism can be used in quantum computing by using superposed entangled states when running specialized quantum algorithms, such as Shor’s factorization of tiny integers for example.
Even if such a computation model is not currently practicable, it has theoretical significance, particularly in relation to the P = NP problem, which asks if the complexity classes produced by using “polynomial time” and “non-deterministic polynomial time” as least upper bounds are same. On a deterministic computer, simulating an NP-algorithm requires “exponential time.” If a task can be solved in polynomial time on a non-deterministic system, it belongs to the NP difficulty class. If an issue is in NP and is not easier than any other NP problem, it is said to be NP-complete. The Knapsack problem, the traveling salesman problem, and the Boolean satisfiability problem are all NP-complete combinatorial problems. The most well-known algorithm has exponential complexity for all of these problems. If any of these issues could be solved in polynomial time on a deterministic machine, then all NP problems could be solved in polynomial time as well, and P = NP would be established. As of 2017, it is widely assumed that P NP, implying that the worst situations of NP problems are fundamentally difficult to solve, i.e., take far longer than any feasible time span (decades) given interesting input lengths.
Parallel and distributed computing
Parallel and distributed computing involve dividing processing across multiple processors that all operate at the same time. The fundamental distinction between the various models is the method of sending data between processors. Data transmission between processors is typically very quick in parallel computing, whereas data transfer between processors in distributed computing is done across a network and is thus substantially slower.
A computation on N processors takes at least the quotient by N of the time it takes on a single processor. In reality, because some subtasks cannot be parallelized and some processors may need to wait for a result from another processor, this theoretically ideal bound will never be attained.
The key complexity issue is thus to develop algorithms so that the product of computing time by the number of processors is as close as possible to the time required to perform the same computation on a single processor.
Quantum computation
A quantum computer is a computer with a quantum mechanics-based computation model. The Church–Turing thesis holds true for quantum computers, implying that any issue that a quantum computer can solve may also be solved by a Turing machine. However, some tasks might theoretically be solved using a quantum computer rather than a classical computer with a significantly lower temporal complexity. For the time being, this is strictly theoretical, as no one knows how to develop a practical quantum computer.
Quantum complexity theory was created to investigate the different types of issues that can be solved by quantum computers. It’s utilized in post-quantum cryptography, which is the process of creating cryptographic protocols that are resistant to quantum computer assaults.
Complexity of the problem (lower bounds)
The infimum of the complexities of the algorithms that may solve the problem, including undiscovered techniques, is the complexity of the problem. As a result, the complexity of a problem is equal to the complexity of any algorithm that solves it.
As a result, any complexity given in large O notation represents a complexity of both the algorithm and the accompanying problem.
On the other hand, obtaining nontrivial lower bounds on issue complexity is often difficult, and there are few strategies for doing so.
In order to solve most issues, all input data must be read, which takes time proportionate to the magnitude of the data. As a result, such problems have at least linear complexity, or, in big omega notation, a complexity of Ω(n).
Some problems, such as those in computer algebra and computational algebraic geometry, have very large solutions. Because the output must be written, the complexity is constrained by the maximum size of the output.
The number of comparisons required for a sorting algorithm has a nonlinear lower bound of Ω(nlogn). As a result, the best sorting algorithms are the most efficient since their complexity is O(nlogn). The fact that there are n! ways to organise n things leads to this lower bound. Because each comparison divides this collection of n! orders into two pieces, the number of N comparisons required to distinguish all orders must be 2N > n!, implying O(nlogn) by Stirling’s formula.
Reducing a problem to another problem is a common strategy for obtaining reduced complexity constraints.
Algorithm development
Evaluating an algorithm’s complexity is an important element of the design process since it provides useful information about the performance that may be expected.
It is a frequent misunderstanding that, as a result of Moore’s law, which predicts the exponential growth of modern computer power, evaluating the complexity of algorithms will become less relevant. This is incorrect because the increased power allows for the processing of massive amounts of data (big data). For example, any algorithm should function well in less than a second when sorting alphabetically a list of a few hundreds of entries, such as the bibliography of a book. On the other hand, for a million entries (for example, the phone numbers of a large city), the basic algorithms that require O(n2) comparisons would have to perform a trillion comparisons, which would take three hours at a speed of 10 million comparisons per second. Quicksort and merge sort, on the other hand, only require nlogn comparisons (as average-case complexity for the former, as worst-case complexity for the latter). This produces around 30,000,000 comparisons for n = 1,000,000, which would take only 3 seconds at 10 million comparisons per second.
As a result, assessing complexity may allow for the elimination of many inefficient algorithms prior to implementation. This can also be used to fine-tune complex algorithms without having to test all possible variants. The study of complexity allows focusing the effort for increasing the efficiency of an implementation by determining the most costly steps of a complex algorithm.
To acquaint yourself in-detail with the certification curriculum you can expand and analyze the table below.
The EITC/IS/CCTF Computational Complexity Theory Fundamentals Certification Curriculum references open-access didactic materials in a video form. Learning process is divided into a step-by-step structure (programmes -> lessons -> topics) covering relevant curriculum parts. Unlimited consultancy with domain experts are also provided.
For details on the Certification procedure check How it Works.
Primary supportive curriculum reading materials
S. Arora, B. Barak, Computational Complexity: A Modern Approach
https://theory.cs.princeton.edu/complexity/book.pdf
Secondary supportive curriculum reading materials
O. Goldreich, Introduction to Complexity Theory:
https://www.wisdom.weizmann.ac.il/~oded/cc99.html
Supportive curriculum reading materials on discrete mathematics
J. Aspnes, Notes on Discrete Mathematics:
https://www.cs.yale.edu/homes/aspnes/classes/202/notes.pdf
Supportive curriculum reading materials on graph theory
R. Diestel, Graph Theory:
https://diestel-graph-theory.com/
Download the complete offline self-learning preparatory materials for the EITC/IS/CCTF Computational Complexity Theory Fundamentals programme in a PDF file
EITC/IS/CCTF preparatory materials – standard version
EITC/IS/CCTF preparatory materials – extended version with review questions