20071219

Answer to yesterday's game

Although it has not been 24 hours yet, I still want to give you the answer and the solution.
As some of you may have guessed, I hid the most significant 4 bits (or 3 bits) of each pixel of an image in the least significant bits of the corresponding pixel of the photo. That is why the photo did not change much after the modification. Reversing the process is just as intuitive, so I will leave it to you.
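For readers who want to try the idea without any image library, here is a minimal sketch on plain lists of 8-bit values. It is not my original program (that one uses PIL); the names hide and reveal and the sample values are made up for illustration.

```python
def hide(cover, secret, bits=4):
    """Put the top `bits` bits of each secret value into the low bits of cover."""
    keep = 0xFF ^ ((1 << bits) - 1)   # mask that clears the low `bits` bits
    return [(c & keep) | (s >> (8 - bits)) for c, s in zip(cover, secret)]


def reveal(stego, bits=4):
    """Rebuild an approximation of the secret from the low bits of stego."""
    low = (1 << bits) - 1
    return [(v & low) << (8 - bits) for v in stego]


cover = [200, 13, 255, 96]    # made-up channel values from the photo
secret = [255, 128, 17, 0]    # made-up channel values from the hidden image

stego = hide(cover, secret)   # [207, 8, 241, 96]
recovered = reveal(stego)     # [240, 128, 16, 0]
```

With 4 bits, every value of the photo changes by less than 16, and the recovered image keeps only the top 4 bits of each value, so it comes back slightly coarser than the original.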

Here is the image hidden in each photo.


Encode / Decode

photos taken by Sam Rohn - Location Scout

Here is my source code, which is written in Python. To run it, you need the Python Imaging Library (PIL) package.

Reference:
http://en.wikipedia.org/wiki/Steganography
http://www.pythonware.com/products/pil/
http://petitcolas.net/fabien/steganography/
http://utilitymill.com/utility/Steganography_Encode

20071217

Steganography Study

Yesterday I promised a fun game. It turned out not to be that easy to make one. The game has no strict rules, and it is not really a game: it is just an opportunity for you and me to get more familiar with steganography. I provide three photos below, and what you need to do is recover the images hidden in them. I will post the solutions and the program that extracts the hidden images tomorrow.





photos taken by Sam Rohn - Location Scout
Hint: 4 bits (or 3 bits) of each pixel of the hidden image are stored inside the given photos.

20071216

Cryptographic Techniques

Today, I would like to discuss some technical terms related to cryptography. Cryptography is the art of achieving security by encoding messages to make them non-readable. In general, the message that needs to be encoded is called plain text, while the encoded message is called cipher text. There are two primary ways to encode a plain text message: substitution and transposition.

The earliest known substitution scheme is the Caesar Cipher, named after Julius Caesar, who used it. This encryption algorithm replaces each letter in the message with the letter three places further down the alphabet (i.e. A is replaced by D, B is replaced by E, ... Z is replaced by C).
For further information about substitution techniques, see Substitution cipher.
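A small sketch of the shift-by-three rule in Python. The function names are my own, and for simplicity it upper-cases the message and leaves anything that is not a letter untouched.

```python
import string


def caesar(text, shift=3):
    """Replace each letter with the letter `shift` places down the alphabet."""
    upper = string.ascii_uppercase
    table = str.maketrans(upper, upper[shift:] + upper[:shift])
    return text.upper().translate(table)


def caesar_decrypt(text, shift=3):
    """Undo caesar() by shifting the remaining way around the alphabet."""
    return caesar(text, -shift % 26)


cipher = caesar("ABZ")           # "DEC"
plain = caesar_decrypt(cipher)   # "ABZ"
```

Since there are only 25 useful shifts, trying them all breaks the cipher immediately.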

The Rail Fence technique is an example of transposition. The algorithm first writes down the plain text message as a sequence of diagonals, then reads it off row by row. For example, take the original plain text message: what the hell you are doing
W A T E E L O A E O N
 H T H H L Y U R D I G
The resulting cipher text is "wateeloaeonhthhlyurdig". Other transposition techniques are quite similar to Rail Fence; more examples can be found under Transposition cipher.
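The two-row version used in the example can be sketched in a few lines of Python. The function names are my own, and spaces are dropped before encoding, as in the example.

```python
def rail_fence_encrypt(plain):
    """Two-rail fence: letters at even positions, then letters at odd positions."""
    letters = [ch for ch in plain.lower() if ch.isalpha()]
    return "".join(letters[0::2]) + "".join(letters[1::2])


def rail_fence_decrypt(cipher):
    """Interleave the two rows again to get the letters back in order."""
    half = (len(cipher) + 1) // 2          # the top row gets the extra letter
    top, bottom = cipher[:half], cipher[half:]
    return "".join(top[i // 2] if i % 2 == 0 else bottom[i // 2]
                   for i in range(len(cipher)))


cipher = rail_fence_encrypt("what the hell you are doing")
plain = rail_fence_decrypt(cipher)
```

Here cipher is "wateeloaeonhthhlyurdig" and plain is "whatthehellyouaredoing" (the spaces are lost).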

Every encryption and decryption process has two aspects: the algorithm, and the key used for encryption and decryption. There are two cryptographic mechanisms, depending on which keys are used. Symmetric key cryptography uses the same key for both encryption and decryption. In contrast, a mechanism that uses different keys for encryption and decryption is called asymmetric key cryptography.

In symmetric key cryptography, the problem is key distribution. Since the receiver needs the key to decrypt the message, how can the key be delivered to the receiver securely? Whitfield Diffie and Martin Hellman developed an amazing solution to the key distribution problem in 1976, called the Diffie-Hellman key exchange algorithm. The algorithm is very easy to understand. If A and B want to exchange messages, they first agree on a large prime number n and a second number g (a generator modulo n). These two numbers need not be kept secret. Then A and B choose large random numbers x and y respectively, and calculate vA = g^x mod n and vB = g^y mod n. After that, A and B exchange vA and vB. Finally, A computes the secret K1 = vB^x mod n and B computes the secret K2 = vA^y mod n. Both are equal to g^(xy) mod n, so K1 is equal to K2, and the key has been exchanged without ever being sent.

However, the algorithm has a weakness in the exchange itself, because A and B cannot tell who they are really talking to. If a third party, user C, intercepts the messages, C can learn n and g, choose its own random numbers, and send its own values to A and B in place of vB and vA. A and B then each complete the exchange with C instead of with each other, so C holds one key shared with A and another shared with B, and can read or alter all further data exchanged between them. A and B will never know that their data has been hijacked. This problem is called the man-in-the-middle attack.
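To see the arithmetic work, here is a toy run in Python with tiny numbers (real exchanges use numbers hundreds of digits long). The concrete values, and the variable z for the attacker C, are my own choices for illustration.

```python
n, g = 23, 5             # public values: a small prime and a base (toy sizes)
x, y = 6, 15             # private random numbers of A and B

vA = pow(g, x, n)        # A sends vA = g^x mod n
vB = pow(g, y, n)        # B sends vB = g^y mod n

K1 = pow(vB, x, n)       # A computes vB^x mod n
K2 = pow(vA, y, n)       # B computes vA^y mod n

# Man in the middle: C picks its own z and sends vC to both sides
# in place of the real vA and vB.
z = 13
vC = pow(g, z, n)
key_A_thinks_is_shared = pow(vC, x, n)   # A ends up with g^(xz) mod n
key_B_thinks_is_shared = pow(vC, y, n)   # B ends up with g^(yz) mod n
key_C_with_A = pow(vA, z, n)             # C can compute both of them
key_C_with_B = pow(vB, z, n)
```

In this run vA is 8, vB is 19, and both K1 and K2 come out as 2. In the attacked run, C's two keys match what A and B each computed, which is exactly why neither side notices anything.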

Asymmetric key cryptography avoids the key distribution problem described above. When A wants to communicate with others, it can ask a trusted third party T for a pair of keys. One is called the public key, which is distributed to anyone who wants to send data to A. The other is called the private key, which A keeps to itself and uses to decrypt the messages. This mechanism is also called public-key cryptography.

Another message encoding technique is called steganography. It hides the message that is to be kept secret inside other messages.

P.S. I will post a fun game tomorrow.

20071215

Cryptography and Network Security

I am reading the book "Cryptography and Network Security" (Amazon does not seem to have it; please tell me if you find it there). Seriously, I want to learn network security, and that is why I am resuming this blog: to take notes while I read.
I hope you enjoy it, and please raise questions or comments if you have any.

Chapter 1: Introduction
Chapter 1 describes some basic principles of security and the attacks related to each principle. There are five main principles of security: confidentiality, authentication, integrity, availability and non-repudiation.

The principle of confidentiality specifies that only the sender and the intended recipients should be able to access the contents of a message. Confidentiality is compromised if an unauthorized person is able to get the message. A simple example: user A wants to send a message to user B, and another user C captures the message without the permission or knowledge of A and B. This type of attack is called interception.

Authentication mechanisms help establish proof of identity. Authentication ensures that the sender of a message is correctly identified. Suppose user C sends a message to B while posing as user A. User B would not know that the message came from C and not from A. This kind of attack is called fabrication.

When the contents of a message are changed after the sender sends it but before it reaches the receiver, the integrity of the message is lost. For example, user A transfers HKD $100 to user B through an online banking system. Unfortunately, user C captures the message and changes its content to "Transfer HKD $1000". Neither A nor B has any way of knowing that the contents were changed while the message was in transit. This attack is called modification.

Non-repudiation prevents the sender of a message from later denying having sent it.

The principle of availability states that resources should be available to authorized parties at all times. Due to the intentional actions of an unauthorized user C, an authorized user A may not be able to contact a server B. This attack is called interruption.

Theoretically, attacks can be divided into two categories: passive attacks and active attacks. A passive attack is not easy to detect because the attacker does not attempt to modify the data. An active attack is the opposite: it is based on modifying the original message, and this kind of attack is easier to detect but difficult to prevent. The following are examples of each kind.

IP sniffing is a passive attack on an ongoing conversation. An attacker simply observes packets as they pass by. There are two ways to keep attackers from reading sniffed packets: the first is to encrypt the data before sending it, and the second is to encrypt the transmission link itself.

In IP spoofing, an attacker sends packets with an incorrect source address. The receiver has no way to know that the sender is fake, so it sends replies back to the forged address and not to the attacker. However, the attacker can intercept the replies to get the information needed for hijacking attacks, or may simply want to cause a Denial of Service with these messages.

20070922

New ACM Season

The new ACM season has started.
This is my last time to join this competition, so I am trying to forget everything else and focus on ACM this semester.
Guys, pray for me. I want to go to the World Finals, just once.

Here is the list I am planning to finish this month (there are still many other problems to solve next month).
Progress (11/18)
Euler graph *DONE*: http://acm.pku.edu.cn/JudgeOnline/problem?id=2337
Cut Edge: http://acm.pku.edu.cn/JudgeOnline/problem?id=3177
2-connected Component *DONE*: http://acm.pku.edu.cn/JudgeOnline/problem?id=2942
Min degree bounded ST: http://acm.pku.edu.cn/JudgeOnline/problem?id=1639
Min Ratio ST *DONE*: http://acm.pku.edu.cn/JudgeOnline/problem?id=2728
SP *DONE*: http://acm.pku.edu.cn/JudgeOnline/problem?id=3013
Difference Constraint *DONE*: http://acm.pku.edu.cn/JudgeOnline/problem?id=1275
Bellman-Ford *DONE*: http://acm.pku.edu.cn/JudgeOnline/problem?id=1252
Network Flow *DONE*: http://acm.pku.edu.cn/JudgeOnline/problem?id=1459
Network Flow: http://acm.pku.edu.cn/JudgeOnline/problem?id=2391
Bipartite Matching *DONE*: http://acm.pku.edu.cn/JudgeOnline/problem?id=1325
Bipartite Matching *DONE*: http://acm.pku.edu.cn/JudgeOnline/problem?id=2226
Weighted Matching *DONE*: http://acm.pku.edu.cn/JudgeOnline/problem?id=2195
Weighted Matching: http://acm.pku.edu.cn/JudgeOnline/problem?id=2516
Least Common Ancestor: http://acm.pku.edu.cn/JudgeOnline/problem?id=1986
2-SAT *DONE*: http://acm.pku.edu.cn/JudgeOnline/problem?id=2723
Reference: http://home.ustc.edu.cn/~zhuhcheng/ACM/2-SAT.PPT
2-SAT: http://acm.pku.edu.cn/JudgeOnline/problem?id=2749
Minimum Spanning Arborescence: http://acm.pku.edu.cn/JudgeOnline/problem?id=3164
Reference: Chu-Liu-Edmonds algorithm

20070529

Covering Rectilinear Polygons by Rectangles

Covering Rectilinear Polygons by Rectangles is a category of problems. The general definition is as follows. A board B is a finite set of unit squares lying in the plane whose corners have integer coordinates. A rectangle of B is a rectangular subset of B. A rectangle cover of B is a collection of rectangles whose union equals B. The rectangles of a cover may overlap, but each of them must be wholly contained in the board. The goal is to find a cover with as few rectangles as possible.
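To make the definition concrete, here is a small Python sketch that checks whether a collection of rectangles is a valid rectangle cover. The representation is my own assumption: a board is a set of (x, y) unit squares, and a rectangle is a tuple (x1, y1, x2, y2) with inclusive corners. It only verifies a cover; it does not look for a smallest one.

```python
def squares(rect):
    """All unit squares inside the rectangle (x1, y1, x2, y2), corners inclusive."""
    x1, y1, x2, y2 = rect
    return {(x, y) for x in range(x1, x2 + 1) for y in range(y1, y2 + 1)}


def is_rectangle_cover(board, rects):
    """True if every rectangle lies inside the board and their union is the board."""
    covered = set()
    for rect in rects:
        cells = squares(rect)
        if not cells <= board:      # the rectangle sticks out of the board
            return False
        covered |= cells            # overlaps are allowed
    return covered == board


# An L-shaped board made of five unit squares.
board = {(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)}

good = is_rectangle_cover(board, [(0, 0, 2, 0), (0, 0, 0, 2)])   # True
too_big = is_rectangle_cover(board, [(0, 0, 2, 2)])              # False
incomplete = is_rectangle_cover(board, [(0, 0, 2, 0)])           # False
```

The two rectangles of the good cover overlap in the corner square (0, 0), which the definition allows.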

This problem is NP-complete, although I am still thinking about the proof. The first idea that comes up is that it amounts to finding maximal cliques in the graph G(B), whose nodes represent the squares of B. Two squares u and v are adjacent in G(B) if Rec(u, v), the smallest rectangle containing both of them, lies in B. Another NP-complete problem similar to rectangle covering is the set cover problem.

There are some specialized versions of the problem. For example, the given board B may be required to be convex, with no holes inside.

Since the problem is NP-complete, most attention focuses on its approximability. Erdős asked whether Theta / Alpha [1] is bounded, but this has not been proved yet.

The problem is quite interesting and has practical applications. One of them is an operation used in the microelectronics industry: a layer of an integrated circuit is printed on a photographic plate by flashing rectangles onto the plate, using as few rectangles as possible.

[1] Theta is the minimum number of rectangles in a rectangle cover of B; Alpha is the maximum size of an anti-rectangle of B, i.e. a set of squares in B no two of which are contained in a common rectangle of B.

20070505

A report for PC tree

The beta version of the PC tree report is finished. The topic is "PC tree and its application". The report first discusses the use of the PC tree for testing the consecutive-ones property of a 0-1 matrix, then discusses its further applications to interval graph recognition and planarity testing.

However, my writing gets terrible when I describe the applications. If you don't understand what the report is talking about, please feel free to ask me. I will fix it asap, but not today XD

PC tree and its application

20070428

Online judge security problem

These few days, I have been trying to rewrite the core of an online judging system in Python (I implemented it in C before).
First, let me briefly describe what the core system includes.
  • Compile program
    • detect which programming language the user is using (basically it only supports Pascal, C and C++);
    • use the corresponding compiler to compile the source code;
    • limit the size of the object program it can generate;
    • report whether the compilation succeeded.
  • Judge program
    • run the compiled program with a specific test input;
    • limit its running time and memory use while it is executing;
    • restrict its function calls (or system calls) (more on this later);
    • compare its output with the test output;
    • or, use a special judge if necessary.
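Part of the "Judge program" step can be sketched in Python with the standard library alone, assuming a POSIX system (the resource module is not available on Windows). The helper names run_limited and judge are hypothetical, not taken from my C implementation. Only the CPU time limit is enforced here; resource.RLIMIT_AS could cap memory in the same way, and nothing in this sketch restricts system calls.

```python
import resource
import subprocess
import sys


def run_limited(cmd, test_input, cpu_seconds=1, wall_seconds=5):
    """Run cmd on test_input under a CPU-time limit; return (status, stdout)."""

    def set_limits():
        # Runs in the child process just before the program starts (POSIX only).
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    try:
        done = subprocess.run(cmd, input=test_input, capture_output=True,
                              text=True, timeout=wall_seconds,
                              preexec_fn=set_limits)
    except subprocess.TimeoutExpired:
        return "Time Limit Exceeded", ""
    if done.returncode != 0:
        # Crashed, or was killed by the kernel for exceeding the CPU limit.
        return "Runtime Error or limit hit", done.stdout
    return "OK", done.stdout


def judge(cmd, test_input, expected_output):
    """Compare the program's output with the expected output, ignoring spacing."""
    status, out = run_limited(cmd, test_input)
    if status != "OK":
        return status
    return "Accept" if out.split() == expected_output.split() else "Wrong Answer"


# A stand-in for a compiled submission: a one-line program that doubles its input.
submission = [sys.executable, "-c", "print(int(input()) * 2)"]
verdict = judge(submission, "21\n", "42\n")
```

Comparing token by token is a design choice that forgives trailing spaces and newlines; a stricter judge would compare the raw text.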
I would like to discuss why we need to restrict its function calls. It is both a security issue and a way to prevent cheating.
Let's discuss the prevention of cheating first. If a user knows the architecture of the judge system, one way to cheat is to open the test output file and print it, which is guaranteed to get an Accept on the task. Or, if the system stores other users' solutions, the cheater may break in and submit one of them as his own, or even take a copy.

The most serious problem is that a hacker may write a program that tries to damage the operating system (OS). He may delete files, change the ownership of files, or even copy some private files and use them for further attacks. How horrible!

The tool most commonly used for this in C is the function ptrace (if I remember correctly, it is only available on *nix systems). It works as follows.
  • the child process calls ptrace to declare that it wants to be traced;
  • the parent process then keeps tracing its child: each time the child is stopped (for example at a system call or when a signal arrives), the parent is notified before the child goes on;
  • at this point, the parent inspects what the child is trying to do, and if the child is a bad guy, the parent kills it immediately;
  • otherwise, the parent allows the child to continue.
Actually, ptrace itself is also a very useful tool for hacking programs. If the system allows ptrace calls, a hacker can write a program that uses ptrace to stop some of your applications and fetch their private information.

However, I am still trying to figure out how to do the same thing in Python. So far I cannot find anything in Python that is similar to ptrace.

20070411

PC-tree

Definition:

A circular-arc graph is the intersection graph of a family of arcs of a circle.

Problem definition:

Given a graph G = (V, E), determine whether the graph is a circular-arc graph.

The PC-tree is a data structure used to solve the above problem, discovered by Hsu and McConnell. A PC-tree is an unrooted tree. Its predecessor is the PQ-tree, which was discovered by Booth and Lueker in 1976. This article focuses on the PC-tree only, because the PC-tree is a generalization of the PQ-tree, which means any problem that can be solved by a PQ-tree can be reduced to a PC-tree problem.

Each circular-arc graph can be represented by a clique matrix. Consider a family of circular arcs, its corresponding circular-arc graph, and the clique matrix, as follows:


By a theorem, a graph is a circular-arc graph iff its clique matrix has the circular-ones property (proof to be provided later). The problem therefore becomes how to check whether a matrix has the circular-ones property. This is where the PC-tree comes in. Let me first introduce what a PC-tree is.

A PC-tree has two kinds of internal nodes, P-nodes and C-nodes (that's why it is called a PC-tree). The children of a P-node can be arranged in any order (i.e. all permutations). In contrast, the children of a C-node can only be arranged in the original order or its reverse (i.e. if its leaves are "ABC", the only other possible ordering is "CBA"). The PC-tree is built by considering the rows of the clique matrix one by one. The outline of the algorithm is as follows:
for each row in clique matrix {
  • label each leaf with its value (0 or 1) in the current row;
  • find a Terminal Path that separates the 0s from the 1s;
  • align all 1s to one side of the path and all 0s to the other side;
  • split each node on the path into two nodes, one connected to the leaves with 1s and one connected to the leaves with 0s;
  • delete the edges of the path and replace them with a C-node x;
  • contract all edges from x to its C-node neighbors, and contract any node that has only 2 neighbors.
}

The circular-ones ordering can be read off the finished PC-tree by visiting its leaves in clockwise or anti-clockwise order, like the following:
(image to be provided later)
The time complexity of the algorithm is O(|E|) (proof to be provided later).
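To make clear what the circular-ones property means, here is a brute-force check in Python. This is NOT the PC-tree algorithm: it simply tries every circular order of the columns, so it takes factorial time and is only usable on tiny matrices. The function names and the two sample matrices are my own.

```python
from itertools import permutations


def row_is_circular(row):
    """True if the 1s of the row form one block when the row is bent into a circle."""
    # Going around the circle, a single block of 1s starts at most once.
    starts = sum(1 for i in range(len(row)) if row[i - 1] == 0 and row[i] == 1)
    return starts <= 1


def circular_ones_order(matrix):
    """Return a column order giving every row circular ones, or None if none exists."""
    ncols = len(matrix[0])
    # Column 0 is fixed in front because rotating a circular order changes nothing.
    for rest in permutations(range(1, ncols)):
        order = (0,) + rest
        if all(row_is_circular([row[c] for c in order]) for row in matrix):
            return order
    return None


has_property = [[1, 1, 0, 0],
                [0, 1, 1, 0],
                [1, 0, 0, 1]]
no_property = [[1, 1, 0, 0],
               [1, 0, 1, 0],
               [1, 0, 0, 1]]

order = circular_ones_order(has_property)   # (0, 1, 2, 3)
none = circular_ones_order(no_property)     # None
```

The second matrix fails because column 0 would have to sit next to each of the three other columns, but on a circle it has only two neighbors.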

Further applications include: scheduling jobs, minimum coloring, maximum cliques, planarity testing, consecutive-ones testing, ...

20070409

Cross-Site Request Forgeries (CSRF/XSRF)

Today I googled the word "XSRF". Some of you may know what it is, and it is not new. However, it amazed me, because I had never thought of using HTML code like this.

If you don't know what it is, let me introduce it to you. If you do, please point out any mistakes. XSRF is a web application attack method. It applies not only to HTML but to all markup languages.
Let's look at an example, and then you may see what the problem is.
<img src="http://www.bookstore.com/order.php?isbn=817525766-0&quantity=100&submit=yes" height=0 width=0 />

Although the img does not contain an image, the HTTP request is still sent to the server. By setting the height and width of the img, or by using CSS styles, the broken image can be hidden, so the user does not even know that such a request has been sent. XSRF can force users to update their profile or post a new message or thread unknowingly. That may not sound so dangerous, but it gets much worse than that.
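To see why the server cannot tell this "image" from a real order, it helps to look at the request the way the server does. A short Python sketch using only the standard library; the URL is the one from the example above.

```python
from urllib.parse import urlsplit, parse_qs

# The src attribute of the hidden img from the example above.
src = "http://www.bookstore.com/order.php?isbn=817525766-0&quantity=100&submit=yes"

url = urlsplit(src)
params = parse_qs(url.query)

host = url.netloc    # where the browser sends the request
path = url.path      # which script handles it
```

The browser fetches /order.php on www.bookstore.com with isbn, quantity and submit as parameters, and it attaches the user's cookies for that site as usual. Nothing in the request itself says that it came from an img tag on another page.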

Difference between XSS and XSRF

Cross-Site Scripting (XSS) and XSRF look quite similar, don't they? Actually they are not the same. XSS tries either to abuse client-side active scripting holes or to send privileged information to an unknown site by inserting active code into an HTML document.

XSRF does not rely on client-side active scripting. It tries to take unwanted, unapproved actions on a site where the user has some authority.

It is difficult to filter content, because an XSRF attack may look like this:
<img src="http://itisnotanattack.com/logo.jpg" height=0 width=0 />

When your client requests logo.jpg, the file does not exist, but the itisnotanattack.com server redirects the request to wherever it likes.

XSRF can also be used to attack servers behind firewalls. It is not just public webapps that are at risk.
<img src="http://intranet/admin/purgeDB?confirm=yes" />

If the attacker knows enough to construct the URL and can get the admin to open this file, then everything is done. Now you know how funny (dangerous) it is.

20070406

Hide disclaimer in CSE personal hp

The day before yesterday, Billy tried to hide the disclaimer on his personal homepage in CSE.
Tom suggested that leaving "</body></html>" out of the html file keeps the disclaimer from being included in the file.
However, this clearly does not satisfy the W3C rules.
The file is obviously not complete.

Actually, I still cannot figure out how the disclaimer gets written into my file.
I think the page is passed to some program that adds the disclaimer before it is included in the response packet.
What it adds to my file is as follows:
<link ... href="/css/disclaimer.css" ... >
<script ... src="/js/disclaimer.js" ... ></script>

and
<div></div><div></div>
<div id="cse-disclaimer">
...
</div><!-- #cse-disclaimer -->

The .js file has a function disclaimer(), which may be called at the end of the page (before </body></html>). This function enables the visibility of the disclaimer block.

What I did to block the disclaimer again is very simple: create a .js file whose content is similar to disclaimer.js.
When the body is being loaded, call the block function in my .js file, and it disables the visibility of the disclaimer.
Is it easy? Yes, it is. However, I spent a whole day testing it, because I am not familiar with JS.

Why start this blog?

For the past few days I have been wondering whether I should create this blog to take notes on interesting things related to programming or security.
(I expect more on security. It is very new and interesting to me.)

Hope I can post something every day. :D
Please feel free to share your opinion.