How to Quickly and Easily Remove Duplicate Elements from a Container in C++

September 13, 2019

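Suppose a vector<string> named strs holds some words (all lowercase), some of which appear more than once, and we want to know which distinct words occur. What is a simple and efficient way to remove the duplicates?
Two approaches are recommended here.
The first uses functions from <algorithm>.
Call sort first so that equal elements become adjacent. Then call unique, which shifts one element from each run of equal elements toward the front and returns an iterator to the new logical end; the elements from that iterator onward are left in a valid but unspecified state. Finally, call erase to remove that tail.
The source code is as follows: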

#include<algorithm>
#include<iostream>
#include<string>
#include<vector>
using namespace std;

int main(){
    string str[] = {"word","yellow","at","number","yellow","at","student","number"};
    vector<string> strs(str,str+8);
    // Remove duplicates: sort, compact the unique elements, erase the tail
    sort(strs.begin(),strs.end());  
    auto end_unique = unique(strs.begin(),strs.end());  
    strs.erase(end_unique,strs.end());

    return 0;
}

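If output statements are inserted after each step, the result is:
After initialization: word yellow at number yellow at student number
After sort: at at number number student word yellow yellow
After unique: at number student word yellow number at yellow
After erase: at number student word yellow
(The last three elements on the "After unique" line are whatever the implementation left behind the returned iterator; their values are unspecified and may differ between standard libraries.)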
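
The three steps above can be wrapped in a small reusable function. The following is a minimal sketch; the name sort_and_dedupe is a hypothetical helper introduced here for illustration, not something from the standard library or the original listing.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): sorts v and removes
// duplicate elements in place using the sort / unique / erase steps.
template <typename T>
void sort_and_dedupe(std::vector<T>& v) {
    std::sort(v.begin(), v.end());                   // equal elements become adjacent
    auto new_end = std::unique(v.begin(), v.end());  // keep one element per run
    v.erase(new_end, v.end());                       // drop the unspecified tail
}
```

Calling sort_and_dedupe(strs) on the vector from the listing above leaves strs holding at, number, student, word, yellow.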

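The second approach copies the elements into a set.
Because a set never stores duplicate elements, constructing a set directly from strs is enough to remove the duplicates.
The source code is as follows: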

#include<algorithm>
#include<iostream>
#include<string>
#include<vector>
#include<set>

using namespace std;

int main(){
    string str[] = {"word","yellow","at","number","student","at","word","number"};
    vector<string> strs(str,str+8);

    set<string> se(strs.begin(),strs.end());

    return 0;
}

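If output statements are inserted at the end, the result is:
strs: word yellow at number student at word number
se: at number student word yellow
Compared with the first approach, the advantage of copying into a set is that a single statement does the deduplication. The drawback is that the original container strs is left unchanged; the deduplicated result only ends up in se.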
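
If the deduplicated result is needed back in strs itself, one option is to assign the contents of the set back to the vector. This is a minimal sketch; the function name dedupe_via_set is hypothetical and used only for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): removes duplicates
// from strs by going through a std::set and copying the result back.
void dedupe_via_set(std::vector<std::string>& strs) {
    // The set keeps one copy of each distinct string, in sorted order.
    std::set<std::string> se(strs.begin(), strs.end());
    // Replace the vector's contents with the set's contents.
    strs.assign(se.begin(), se.end());
}
```

Like sort followed by unique and erase, this leaves strs sorted, so it costs an extra container without preserving the original order.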
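Note: both approaches are simple, but neither preserves the original relative order of the elements. The first sorts strs, and a set stores its elements in sorted order. If the order must be kept, the following approach can be used.
Insert the elements of strs into a set one by one; whenever an insertion fails, erase that element from strs. This removes the duplicates from strs without changing the order of the remaining elements.
The source code is as follows: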

#include<algorithm>
#include<iostream>
#include<string>
#include<vector>
#include<set>

using namespace std;

int main(){
    string str[] = {"word","yellow","at","number","yellow","at","student","number"};
    vector<string> strs(str,str+8);

    set<string> se;
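    for(vector<string>::iterator it = strs.begin();it!=strs.end();){
        // ret.second is false when the element was already in the set
        pair<set<string>::iterator,bool> ret = se.insert(*it);
        // erase() invalidates it, so continue from the iterator it returns
        if(ret.second==false){it = strs.erase(it);continue;}
        it++;
    }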

    return 0;
}
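
Erasing from the middle of a vector shifts every later element, so the loop above can take quadratic time when there are many duplicates. A possible alternative is to build the result in a single pass. The sketch below swaps in std::unordered_set for std::set (an assumption made here for average constant-time lookups), and the function name dedupe_keep_order is hypothetical.

```cpp
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper (name chosen for illustration): removes duplicates
// from strs while keeping the first occurrence of each element in place.
void dedupe_keep_order(std::vector<std::string>& strs) {
    std::unordered_set<std::string> seen;  // elements encountered so far
    std::vector<std::string> result;
    result.reserve(strs.size());
    for (auto& s : strs) {
        // insert() copies s into the set; .second is true only the first time
        if (seen.insert(s).second) {
            result.push_back(std::move(s));  // s is not read again afterwards
        }
    }
    strs.swap(result);  // strs now holds the deduplicated sequence
}
```

For the vector used in the listing above, this leaves strs holding word, yellow, at, number, student.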
