We are given two arrays, arr1[] and arr2[], that contains n, m number of distinct elements in unsorted order. We are supposed to find, whether the given array arr2 is a subset of another given array arr1 or not.
Example –
Input: arr1[] = {1,6,18,20,3,5,7} int arr2[] = {18,3,7,6} Output: the arr2 is subset of arr1 Input: arr1[] = {1,6,18,20,3,5,7} int arr2[] = {8,3,7,6} Output: the arr2 is not a subset of arr1
This problem can be solved using two methods –
- The Naive Approach. Time complexity – O(N*M)
- The optimized Approach. Time complexity – O(MlogM + NlogN)
Naive Approach
The naive approach would be comparing each element of the arr2 to each and every element of arr1. This would involve interacting over the N elements of arr1, M number of times, where M is the number of elements in arr2.
Since this approach involves the nested loop, the time complexity of this approach reaches O(N*M).
Algorithm
is_subset(arr1, arr2, n, m) { for(i=0; i<m; i++) { for(j=0; j<n; j++) { if(arr2[j] == arr1[i]) break; } if(j==m) return false } return true }
Optimized Approach
The optimized and fast approach to solving this problem (of finding whether a given array is a subset of another array) will involve, first sorting both the arrays and then comparing whether the arr2 is a subset of arr1 in O(N+M) time as described below –
- Initialize both the arrays (arr1, arr2)
- Sort both the arrays (arr1, arr2).
- Initialize pointer i, j to the first element of arr1, arr2.
- If arr1[i] == arr[j], check for next elements by increment both the pointer i,j by 1
- Else if arr1[i] < arr2[j], increment only the ith pointer.
- Else, break because this symbolizes, the arr2 is not a subset of arr1.
Algorithm
issubset(arr1, arr2, n, m) { sort(arr1, arr1+n); sort(arr2, arr2+m); i=0 j=0 while(i<n && j<m) { if(arr1[i] == arr2[j]){ i++ j++ } else if(arr1[i] < arr2[j]) i++ else break } if(j == m) print("arr2[] is a subset of arr1[]") else print("arr2[] is not a subset of arr1[]") }
Implementation of the above algorithm
//program to check if array arr1[] //is subset of second array arr2[] #include <bits/stdc++.h> using namespace std; int issubset(int *arr1, int *arr2, int n, int m) { int i,j; //sort arr1 and arr2. This will take O(nlogn + mlogn) time sort(arr1, arr1+n); sort(arr2, arr2+m); for(i=0, j=0; i<n, j<m; ){ //if elements in both the array are same increment // pointer for both the array if(arr1[i] == arr2[j]){ i++; j++; } //if current element of the arr1[] is samller than arr2[] //the check the next element of array1 else if(arr1[i] < arr2[j]) i++; //if element of arr1[] is greater than arr2[], then //arr2[] is not subset of arr1[] else break; } //all the elements of arr2[] are found in arr1[] //then j will reach the end of arr2[] size if(j == m) cout<<"arr2[] is a subset of arr1[]"<<endl; else cout<<"arr2[] is not a subset of arr1[]"<<endl; return 0; } int main() { int arr1[] = {1,6,18,20,3,5,7}; int arr2[] = {18,3,7,6}; int n = sizeof(arr1)/sizeof(arr1[0]); int m = sizeof(arr2)/sizeof(arr2[0]); issubset(arr1, arr2, n, m); return 0; }
Output
arr2[] is a subset of arr1[]
Time Complexity
The run time complexity of the above approach is O(NlogN + MlognM) ≈ O(NlogN), where N is the number of elements in arr1[] and M is the number of elements in arr2[]. This time complexity arose due to the sorting we did in the beginning. The time complexity of the loop we are using for finding whether the arr2 is a subset of arr1 is O(N + M)