[코딜리티] codility lesson 5 Frefix Sums - GenonimRangeQuery 100%

문제.

Programming language: 
A DNA sequence can be represented as a string consisting of the letters ACG and T, which correspond to the types of successive nucleotides in the sequence. Each nucleotide has an impact factor, which is an integer. Nucleotides of types ACG and T have impact factors of 1, 2, 3 and 4, respectively. You are going to answer several queries of the form: What is the minimal impact factor of nucleotides contained in a particular part of the given DNA sequence?
The DNA sequence is given as a non-empty string S = S[0]S[1]...S[N-1] consisting of N characters. There are M queries, which are given in non-empty arrays P and Q, each consisting of M integers. The K-th query (0 ≤ K < M) requires you to find the minimal impact factor of nucleotides contained in the DNA sequence between positions P[K] and Q[K] (inclusive).
For example, consider string S = CAGCCTA and arrays P, Q such that:
P[0] = 2 Q[0] = 4 P[1] = 5 Q[1] = 5 P[2] = 0 Q[2] = 6
The answers to these M = 3 queries are as follows:
  • The part of the DNA between positions 2 and 4 contains nucleotides G and C (twice), whose impact factors are 3 and 2 respectively, so the answer is 2.
  • The part between positions 5 and 5 contains a single nucleotide T, whose impact factor is 4, so the answer is 4.
  • The part between positions 0 and 6 (the whole string) contains all nucleotides, in particular nucleotide A whose impact factor is 1, so the answer is 1.
Assume that the following declarations are given:
struct Results { int * A; int M; };
Write a function:
struct Results solution(char *S, int P[], int Q[], int M);
that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M integers specifying the consecutive answers to all queries.
The sequence should be returned as:
  • a Results structure (in C), or
  • a vector of integers (in C++), or
  • a Results record (in Pascal), or
  • an array of integers (in any other programming language).
For example, given the string S = CAGCCTA and arrays P, Q such that:
P[0] = 2 Q[0] = 4 P[1] = 5 Q[1] = 5 P[2] = 0 Q[2] = 6
the function should return the values [2, 4, 1], as explained above.
Assume that:
  • N is an integer within the range [1..100,000];
  • M is an integer within the range [1..50,000];
  • each element of arrays P, Q is an integer within the range [0..N − 1];
  • P[K] ≤ Q[K], where 0 ≤ K < M;
  • string S consists only of upper-case English letters A, C, G, T.
Complexity:
  • expected worst-case time complexity is O(N+M);
  • expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).
Elements of input arrays can be modified.
Copyright 2009–2017 by Codility Limited. All Rights Reserved. Unauthorized copying, publication or disclosure prohibited.

풀이.

// you can also use imports, for example:
// import java.util.*;
// you can write to stdout for debugging purposes, e.g.
// System.out.println("this is a debug message");
class Solution {
    public int[] solution(String S, int[] P, int[] Q) {
        // write your code in Java SE 8
        int[] convertA = new int[S.length()];
        int[] convertC = new int[S.length()];
        int[] convertG = new int[S.length()];
        int[] result = new int[P.length];
        int A = -1; int C = -1; int G = -1;
        
        
        for(int i = 0 ; i < S.length() ; i++){
            if(S.charAt(i) == 'A') A=i;
            if(S.charAt(i) == 'C') C=i;
            if(S.charAt(i) == 'G') G=i;
            convertA[i] = A;
            convertC[i] = C;
            convertG[i] = G;
        }
        
        for( int i = 0 ; i < P.length ; i++){
            result[i] = 4;
            if(convertA[Q[i]] <= Q[i] && convertA[Q[i]] >= P[i]){ result[i] = 1; continue;}
            if(convertC[Q[i]] <= Q[i] && convertC[Q[i]] >= P[i]){ result[i] = 2; continue;}
            if(convertG[Q[i]] <= Q[i] && convertG[Q[i]] >= P[i]){ result[i] = 3; continue;}
        }
        
        return result;
    }

설명.

A,C,G 그리고 T 로 이루어진 문자열이 있다.
각각은 1,2,3 그리고 4의 값을 가진다. P 와 Q 배열은 문자열의 시작 과 끝 위치를 나타낸다.
P~Q 사이에 가장 작은 값을 찾아서 기록한 배열을 출력 하여라.

실패 과정.

C의 포인터를 이용하려고 하였으나. 내가 까먹은건지 코딜리티가 이상한건지 잘 안되는 듯 하여 JAVA 로 다시 만듬.

자바는 문자열을 문자배열로 인식하지 않는 건지 코딜리티가 그런건지 잘 모르겠으나 안먹음.

결국. 내가 생각한 풀이법이 안되어서 심플한 탐색을 해보았더니, 성능 부분에서 모두 타임아웃이 걸림.

java string 객체 메소드에서 힌트를 얻어볼까 하여 String 객체를 뜯어 보니 각 메소드들이나 구조체 자체가 타임아웃을 피하기 어려워 보임.

전혀 다른 방식으로 재설계 함.

T 를 기본값으로 정하고 연산속도를 높임.
우선순위가 높으면 바로 넘어가 버려서 최대한 연산속도를 확보.
네스티드 루프 를 피하기 위한 방법
- 각각의 종료 시점에 각 DNA 의 최종 위치를 기록한 배열 을 만듬.
- DNA 최종위치가 시작위치 이후에 있는지 검사.

끝.

댓글

이 블로그의 인기 게시물

[코딜리티] codility lesson 5 Frefix Sums - MinAvgTwoSlice 100%

구글 블로그 ( blogspot ) 라벨 위젯 을 카테고리 처럼 만들기

[코딜리티] codility lesson 7 Stacks and Queues - StoneWall 100%