2017-10-07 264 views
1

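Remove a row from a CSV file using Octave/Matlab

I know this is a very common question, but I could not find an answer that helps with my problem. If something similar already exists, I will delete this post.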

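I am working in Octave on movies.csv from Kaggle's 5000 Movies Database, and I would like to delete every row that has a zero in the budget or revenue cell. I had some trouble reading the columns of the file, so I have already copied the revenue column and pasted it next to the budget column. I would of course like to know why Octave identifies parts of the text as separate columns, but that is not my most urgent problem right now.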

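Update: the matrix contains both numeric and string values, and I want to keep all the data of the rows whose budget and revenue are greater than zero. Here is an example; I hope it is understandable. The file I am working on already has no header, but I left the header in here to make the example easier to follow.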

budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count                                                                                            
237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""name"": ""Fantasy""}, {""id"": 878, ""name"": ""Science Fiction""}]",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"": 2964, ""name"": ""future""}, {""id"": 3386, ""name"": ""space war""}, {""id"": 3388, ""name"": ""space colony""}, {""id"": 3679, ""name"": ""society""}, {""id"": 3801, ""name"": ""space travel""}, {""id"": 9685, ""name"": ""futuristic""}, {""id"": 9840, ""name"": ""romance""}, {""id"": 9882, ""name"": ""space""}, {""id"": 9951, ""name"": ""alien""}, {""id"": 10148, ""name"": ""tribe""}, {""id"": 10158, ""name"": ""alien planet""}, {""id"": 10987, ""name"": ""cgi""}, {""id"": 11399, ""name"": ""marine""}, {""id"": 13065, ""name"": ""soldier""}, {""id"": 14643, ""name"": ""battle""}, {""id"": 14720, ""name"": ""love affair""}, {""id"": 165431, ""name"": ""anti war""}, {""id"": 193554, ""name"": ""power relations""}, {""id"": 206690, ""name"": ""mind and soul""}, {""id"": 209714, ""name"": ""3d""}]",en,Avatar,"In the 22nd century, a paraplegic Marine is dispatched to the moon Pandora on a unique mission, but becomes torn between following orders and protecting an alien civilization.",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289}, {""name"": ""Twentieth Century Fox Film Corporation"", ""id"": 306}, {""name"": ""Dune Entertainment"", ""id"": 444}, {""name"": ""Lightstorm Entertainment"", ""id"": 574}]","[{""iso_3166_1"": ""US"", ""name"": ""United States of America""}, {""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""}]",2009-12-10,2787965087,162,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso_639_1"": ""es"", ""name"": ""Espa\u00f1ol""}]",Released,Enter the World of Pandora.,Avatar,7.2,11800                                                                                           
300000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""name"": ""Fantasy""}, {""id"": 28, ""name"": ""Action""}]",http://disney.go.com/disneypictures/pirates/,285,"[{""id"": 270, ""name"": ""ocean""}, {""id"": 726, ""name"": ""drug abuse""}, {""id"": 911, ""name"": ""exotic island""}, {""id"": 1319, ""name"": ""east india trading company""}, {""id"": 2038, ""name"": ""love of one's life""}, {""id"": 2052, ""name"": ""traitor""}, {""id"": 2580, ""name"": ""shipwreck""}, {""id"": 2660, ""name"": ""strong woman""}, {""id"": 3799, ""name"": ""ship""}, {""id"": 5740, ""name"": ""alliance""}, {""id"": 5941, ""name"": ""calypso""}, {""id"": 6155, ""name"": ""afterlife""}, {""id"": 6211, ""name"": ""fighter""}, {""id"": 12988, ""name"": ""pirate""}, {""id"": 157186, ""name"": ""swashbuckler""}, {""id"": 179430, ""name"": ""aftercreditsstinger""}]",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, has come back to life and is headed to the edge of the Earth with Will Turner and Elizabeth Swann. But nothing is quite as it seems.",139.082615,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}, {""name"": ""Jerry Bruckheimer Films"", ""id"": 130}, {""name"": ""Second Mate Productions"", ""id"": 19936}]","[{""iso_3166_1"": ""US"", ""name"": ""United States of America""}]",2007-05-19,961000000,169,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500                                                                                           
245000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""name"": ""Adventure""}, {""id"": 80, ""name"": ""Crime""}]",http://www.sonypictures.com/movies/spectre/,206647,"[{""id"": 470, ""name"": ""spy""}, {""id"": 818, ""name"": ""based on novel""}, {""id"": 4289, ""name"": ""secret agent""}, {""id"": 9663, ""name"": ""sequel""}, {""id"": 14555, ""name"": ""mi6""}, {""id"": 156095, ""name"": ""british secret service""}, {""id"": 158431, ""name"": ""united kingdom""}]",en,Spectre,"A cryptic message from Bond’s past sends him on a trail to uncover a sinister organization. While M battles political forces to keep the secret service alive, Bond peels back the layers of deceit to reveal the terrible truth behind SPECTRE.",107.376788,"[{""name"": ""Columbia Pictures"", ""id"": 5}, {""name"": ""Danjaq"", ""id"": 10761}, {""name"": ""B24"", ""id"": 69434}]","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""}, {""iso_3166_1"": ""US"", ""name"": ""United States of America""}]",2015-10-26,880674609,148,"[{""iso_639_1"": ""fr"", ""name"": ""Fran\u00e7ais""}, {""iso_639_1"": ""en"", ""name"": ""English""}, {""iso_639_1"": ""es"", ""name"": ""Espa\u00f1ol""}, {""iso_639_1"": ""it"", ""name"": ""Italiano""}, {""iso_639_1"": ""de"", ""name"": ""Deutsch""}]",Released,A Plan No One Escapes,Spectre,6.3,4466                                                                                           
250000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""name"": ""Crime""}, {""id"": 18, ""name"": ""Drama""}, {""id"": 53, ""name"": ""Thriller""}]",http://www.thedarkknightrises.com/,49026,"[{""id"": 849, ""name"": ""dc comics""}, {""id"": 853, ""name"": ""crime fighter""}, {""id"": 949, ""name"": ""terrorist""}, {""id"": 1308, ""name"": ""secret identity""}, {""id"": 1437, ""name"": ""burglar""}, {""id"": 3051, ""name"": ""hostage drama""}, {""id"": 3562, ""name"": ""time bomb""}, {""id"": 6969, ""name"": ""gotham city""}, {""id"": 7002, ""name"": ""vigilante""}, {""id"": 9665, ""name"": ""cover-up""}, {""id"": 9715, ""name"": ""superhero""}, {""id"": 9990, ""name"": ""villainess""}, {""id"": 10044, ""name"": ""tragic hero""}, {""id"": 13015, ""name"": ""terrorism""}, {""id"": 14796, ""name"": ""destruction""}, {""id"": 18933, ""name"": ""catwoman""}, {""id"": 156082, ""name"": ""cat burglar""}, {""id"": 156395, ""name"": ""imax""}, {""id"": 173272, ""name"": ""flood""}, {""id"": 179093, ""name"": ""criminal underworld""}, {""id"": 230775, ""name"": ""batman""}]",en,The Dark Knight Rises,"Following the death of District Attorney Harvey Dent, Batman assumes responsibility for Dent's crimes to protect the late attorney's reputation and is subsequently hunted by the Gotham City Police Department. Eight years later, Batman encounters the mysterious Selina Kyle and the villainous Bane, a new terrorist leader who overwhelms Gotham's finest. 
The Dark Knight resurfaces to protect a city that has branded him an enemy.",112.31295,"[{""name"": ""Legendary Pictures"", ""id"": 923}, {""name"": ""Warner Bros."", ""id"": 6194}, {""name"": ""DC Entertainment"", ""id"": 9993}, {""name"": ""Syncopy"", ""id"": 9996}]","[{""iso_3166_1"": ""US"", ""name"": ""United States of America""}]",2012-07-16,1084939099,165,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106                                                                                            
260000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""name"": ""Adventure""}, {""id"": 878, ""name"": ""Science Fiction""}]",http://movies.disney.com/john-carter,49529,"[{""id"": 818, ""name"": ""based on novel""}, {""id"": 839, ""name"": ""mars""}, {""id"": 1456, ""name"": ""medallion""}, {""id"": 3801, ""name"": ""space travel""}, {""id"": 7376, ""name"": ""princess""}, {""id"": 9951, ""name"": ""alien""}, {""id"": 10028, ""name"": ""steampunk""}, {""id"": 10539, ""name"": ""martian""}, {""id"": 10685, ""name"": ""escape""}, {""id"": 161511, ""name"": ""edgar rice burroughs""}, {""id"": 163252, ""name"": ""alien race""}, {""id"": 179102, ""name"": ""superhuman strength""}, {""id"": 190320, ""name"": ""mars civilization""}, {""id"": 195446, ""name"": ""sword and planet""}, {""id"": 207928, ""name"": ""19th century""}, {""id"": 209714, ""name"": ""3d""}]",en,John Carter,"John Carter is a war-weary, former military captain who's inexplicably transported to the mysterious and exotic planet of Barsoom (Mars) and reluctantly becomes embroiled in an epic conflict. It's a world on the brink of collapse, and Carter rediscovers his humanity when he realizes the survival of Barsoom and its people rests in his hands.",43.926995,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}]","[{""iso_3166_1"": ""US"", ""name"": ""United States of America""}]",2012-03-07,284139100,132,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"Lost in our world, found in another.",John Carter,6.1,2124                                                                                            
258000000,"[{""id"": 14, ""name"": ""Fantasy""}, {""id"": 28, ""name"": ""Action""}, {""id"": 12, ""name"": ""Adventure""}]",http://www.sonypictures.com/movies/spider-man3/,559,"[{""id"": 851, ""name"": ""dual identity""}, {""id"": 1453, ""name"": ""amnesia""}, {""id"": 1965, ""name"": ""sandstorm""}, {""id"": 2038, ""name"": ""love of one's life""}, {""id"": 3446, ""name"": ""forgiveness""}, {""id"": 3986, ""name"": ""spider""}, {""id"": 4391, ""name"": ""wretch""}, {""id"": 4959, ""name"": ""death of a friend""}, {""id"": 5776, ""name"": ""egomania""}, {""id"": 5789, ""name"": ""sand""}, {""id"": 5857, ""name"": ""narcism""}, {""id"": 6062, ""name"": ""hostility""}, {""id"": 8828, ""name"": ""marvel comic""}, {""id"": 9663, ""name"": ""sequel""}, {""id"": 9715, ""name"": ""superhero""}, {""id"": 9748, ""name"": ""revenge""}]",en,Spider-Man 3,"The seemingly invincible Spider-Man goes up against an all-new crop of villain – including the shape-shifting Sandman. While Spider-Man’s superpowers are altered by an alien organism, his alter ego, Peter Parker, deals with nemesis Eddie Brock and also gets caught up in a love triangle.",115.699814,"[{""name"": ""Columbia Pictures"", ""id"": 5}, {""name"": ""Laura Ziskin Productions"", ""id"": 326}, {""name"": ""Marvel Enterprises"", ""id"": 19551}]","[{""iso_3166_1"": ""US"", ""name"": ""United States of America""}]",2007-05-01,890871626,139,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso_639_1"": ""fr"", ""name"": ""Fran\u00e7ais""}]",Released,The battle within.,Spider-Man 3,5.9,3576                                                                                            
260000000,"[{""id"": 16, ""name"": ""Animation""}, {""id"": 10751, ""name"": ""Family""}]",http://disney.go.com/disneypictures/tangled/,38757,"[{""id"": 1562, ""name"": ""hostage""}, {""id"": 2343, ""name"": ""magic""}, {""id"": 2673, ""name"": ""horse""}, {""id"": 3205, ""name"": ""fairy tale""}, {""id"": 4344, ""name"": ""musical""}, {""id"": 7376, ""name"": ""princess""}, {""id"": 10336, ""name"": ""animation""}, {""id"": 33787, ""name"": ""tower""}, {""id"": 155658, ""name"": ""blonde woman""}, {""id"": 162219, ""name"": ""selfishness""}, {""id"": 163545, ""name"": ""healing power""}, {""id"": 179411, ""name"": ""based on fairy tale""}, {""id"": 179431, ""name"": ""duringcreditsstinger""}, {""id"": 215258, ""name"": ""healing gift""}, {""id"": 234183, ""name"": ""animal sidekick""}]",en,Tangled,"When the kingdom's most wanted-and most charming-bandit Flynn Rider hides out in a mysterious tower, he's taken hostage by Rapunzel, a beautiful and feisty tower-bound teen with 70 feet of magical, golden hair. Flynn's curious captor, who's looking for her ticket out of the tower where she's been locked away for years, strikes a deal with the handsome thief and the unlikely duo sets off on an action-packed escapade, complete with a super-cop horse, an over-protective chameleon and a gruff gang of pub thugs.",48.681969,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}, {""name"": ""Walt Disney Animation Studios"", ""id"": 6125}]","[{""iso_3166_1"": ""US"", ""name"": ""United States of America""}]",2010-11-24,591794936,100,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,They're taking adventure to new lengths.,Tangled,7.4,3330                                                                                           
280000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""name"": ""Adventure""}, {""id"": 878, ""name"": ""Science Fiction""}]",http://marvel.com/movies/movie/193/avengers_age_of_ultron,99861,"[{""id"": 8828, ""name"": ""marvel comic""}, {""id"": 9663, ""name"": ""sequel""}, {""id"": 9715, ""name"": ""superhero""}, {""id"": 9717, ""name"": ""based on comic book""}, {""id"": 10629, ""name"": ""vision""}, {""id"": 155030, ""name"": ""superhero team""}, {""id"": 179431, ""name"": ""duringcreditsstinger""}, {""id"": 180547, ""name"": ""marvel cinematic universe""}, {""id"": 209714, ""name"": ""3d""}]",en,Avengers: Age of Ultron,"When Tony Stark tries to jumpstart a dormant peacekeeping program, things go awry and Earth’s Mightiest Heroes are put to the ultimate test as the fate of the planet hangs in the balance. As the villainous Ultron emerges, it is up to The Avengers to stop him from enacting his terrible plans, and soon uneasy alliances and unexpected action pave the way for an epic and unique global adventure.",134.279229,"[{""name"": ""Marvel Studios"", ""id"": 420}, {""name"": ""Prime Focus"", ""id"": 15357}, {""name"": ""Revolution Sun Studios"", ""id"": 76043}]","[{""iso_3166_1"": ""US"", ""name"": ""United States of America""}]",2015-04-22,1405403694,141,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,A New Age Has Come.,Avengers: Age of Ultron,7.3,6767                                                                                           
250000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""name"": ""Fantasy""}, {""id"": 10751, ""name"": ""Family""}]",http://harrypotter.warnerbros.com/harrypotterandthehalf-bloodprince/dvd/index.html,767,"[{""id"": 616, ""name"": ""witch""}, {""id"": 2343, ""name"": ""magic""}, {""id"": 3872, ""name"": ""broom""}, {""id"": 3884, ""name"": ""school of witchcraft""}, {""id"": 6333, ""name"": ""wizardry""}, {""id"": 10164, ""name"": ""apparition""}, {""id"": 10791, ""name"": ""teenage crush""}, {""id"": 12564, ""name"": ""werewolf""}]",en,Harry Potter and the Half-Blood Prince,"As Harry begins his sixth year at Hogwarts, he discovers an old book marked as 'Property of the Half-Blood Prince', and begins to learn more about Lord Voldemort's dark past.",98.885637,"[{""name"": ""Warner Bros."", ""id"": 6194}, {""name"": ""Heyday Films"", ""id"": 7364}]","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""}, {""iso_3166_1"": ""US"", ""name"": ""United States of America""}]",2009-07-07,933959197,153,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,Dark Secrets Revealed,Harry Potter and the Half-Blood Prince,7.4,5293                                                                                           

So this is my code, which writes a new file containing many zeros in a pattern I cannot make sense of (or so it seems to me).

fid = fopen('original_filepath', 'r'); 
fout = fopen('new_filepath', 'w+'); 

tline = fgetl(fid); 
while ~feof(fid) #here at first I used ischar but 
        #it returned an invalid stream number for the ending tline 
    dollars = strread(tline, '%f', 'delimiter', ','); 
    budget = dollars(1); 
    revenue = dollars(2); 
    if budget = 0 || revenue = 0 
     fprintf(fout, '%s\n', tline); 
    end 
    tline = fgetl(fid); 
end 

fclose(fid); 
fclose(fout); 

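I know strread is not recommended, but with textscan, splitting up the resulting content is even more of a problem. Or maybe I am just too attached to things like dollars(k), which I find very handy.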

+1

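Since one has to register to download the CSV, it is unlikely that anyone here will help. Why don't you use csvread or dlmread? There are also many import functions in the [io package](https://octave.sourceforge.io/io/overview.html). If you want someone to actively help you, you should create a small CSV with a few numeric inputs that shows your problem. See also [MCVE] – Andy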

+0

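Thanks for your answer. A preview of the database is available, so I thought it would be simpler to look at it on the Kaggle website. I will update my question. I did not know about dlmread (and had not found anything about it); I will look at the documentation. – Prn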

+0

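A screenshot of the CSV is not the best choice, because it does not let anyone write code to load it. I see that this CSV contains JSON data, so are you also interested in decoding that data with jsonlab or rapidjson-octave? – Andy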

Answers

0

Using the octave-forge io package:

pkg load io 
c = csv2cell ("prn.csv"); 

header = c(1, :); 
c(1, :) = []; % strip header 

% get column ids 
budget_col = find (strcmp (header, "budget")) 
revenue_col = find (strcmp (header, "revenue")) 

budget = cell2mat (c(:, budget_col)); 
revenue = cell2mat (c(:, revenue_col)); 

% remove line where budget or revenue are zero 
c (budget == 0 | revenue == 0, :) = []; 

% show remaining homepage 
c (:, 3) 

ans = 
{ 
    [1,1] = http://www.avatarmovie.com/ 
    [2,1] = http://disney.go.com/disneypictures/pirates/ 
    [3,1] = http://www.sonypictures.com/movies/spectre/ 
    [4,1] = http://www.thedarkknightrises.com/ 
    [5,1] = http://movies.disney.com/john-carter 
    [6,1] = http://www.sonypictures.com/movies/spider-man3/ 
    [7,1] = http://disney.go.com/disneypictures/tangled/ 
    [8,1] = http://marvel.com/movies/movie/193/avengers_age_of_ultron 
    [9,1] = http://harrypotter.warnerbros.com/harrypotterandthehalf-bloodprince/dvd/index.html 
} 
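The cell2mat calls above assume that every cell in the budget and revenue columns holds a numeric scalar. If some cells might be empty or contain text, a per-cell test is one possible alternative. The following is only a minimal sketch in base Octave: the small cell array stands in for what csv2cell would return, and the data, the column positions, and the names is_pos, keep, and filtered are made up for illustration.

```octave
% Made-up stand-in for the cell array that csv2cell would return.
c = {237000000, 'Avatar',     2787965087;
     0,         'No budget',  1000;
     5000,      'No revenue', 0;
     '',        'Blank',      42};
budget_col = 1;    % assumed position of the budget column in this toy data
revenue_col = 3;   % assumed position of the revenue column in this toy data

% True only for cells that hold a numeric scalar greater than zero.
is_pos = @(v) isnumeric (v) && isscalar (v) && v > 0;

% Logical mask: one entry per row, true where both values are positive.
keep = cellfun (is_pos, c(:, budget_col)) & cellfun (is_pos, c(:, revenue_col));

% Keep whole rows, text columns included.
filtered = c(keep, :);
```

With this toy data only the first row survives. Selecting rows with a keep mask leaves the original c untouched, which makes it easier to compare before and after. The io package also appears to provide cell2csv, which could be used to write the filtered cell array back to a file.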

1

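Your code has multiple errors. Please try this (untested) code, and step into the line

fprintf(fout, '%s\n', [num2str(budget), ',', num2str(revenue)]);

to see whether the correct values are read from the source and written to the destination. If you run into problems, please update your question.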

fid = fopen('original_filepath', 'r'); 
fout = fopen('new_filepath', 'w+'); 


while ~feof(fid) #here at first I used ischar but 
        #it returned an invalid stream number for the ending tline 

    tline = fgetl(fid); % read current line 
    dollars = strread(tline, '%f', 'delimiter', ','); 
    budget = dollars(1); 
    revenue = dollars(2); 
    if ~(budget*revenue == 0) % ensure neither budget nor revenue are zero 
     fprintf(fout, '%s\n', [num2str(budget), ',', num2str(revenue)]); 
    end 


    tline = fgetl(fid); 
end 

fclose(fid); 
fclose(fout); 

Please also consider using csvread() instead of strread, since your data appears to be in CSV format.
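A few things in the loops above are worth checking before relying on them. The condition in the question uses a single = (assignment rather than comparison) and writes out the zero rows instead of dropping them. The loop in this answer calls fgetl twice per pass, which would skip every other line. csvread handles numeric data only, while the sample rows contain quoted text with commas inside, which is a likely reason why parts of the text show up as extra columns. Below is a minimal sketch of a line-by-line filter that takes these points into account. The temporary files, the three made-up rows, and the names in_path, out_path, in_quotes, cuts, and kept are assumptions for illustration, and the sketch treats revenue as the last field of each row, which is not the layout of the real file.

```octave
% Build a small made-up input: budget, a quoted field with commas, revenue.
in_path = [tempname(), '.csv'];
out_path = [tempname(), '.csv'];
fid = fopen (in_path, 'w');
fprintf (fid, '%s\n', '237000000,"Action, Adventure",2787965087');
fprintf (fid, '%s\n', '0,"Drama, Crime",1000');
fprintf (fid, '%s\n', '5000,"Comedy",0');
fclose (fid);

fid = fopen (in_path, 'r');
fout = fopen (out_path, 'w');
kept = 0;

tline = fgetl (fid);
while ischar (tline)                 % fgetl returns -1 once the file is exhausted
  % A comma is a separator only when an even number of quotes precedes it.
  in_quotes = mod (cumsum (tline == '"'), 2) == 1;
  cuts = find (tline == ',' & ~in_quotes);

  budget = str2double (tline(1:cuts(1) - 1));        % first field
  revenue = str2double (tline(cuts(end) + 1:end));   % last field (toy layout)

  if budget > 0 && revenue > 0       % comparison operators; a single = assigns
    fprintf (fout, '%s\n', tline);   % write the whole original line unchanged
    kept = kept + 1;
  end
  tline = fgetl (fid);               % exactly one read per iteration
end

fclose (fid);
fclose (fout);
kept_text = fileread (out_path);     % contents of the filtered file
```

The quote-parity test relies on embedded quotes being doubled (""), as they are in the sample rows. It would not cope with a record whose quoted text contains a line break, as one of the sample rows appears to have, because fgetl stops at the first newline. For the full file, a proper CSV reader such as csv2cell is probably the safer choice.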

+0

Did this solve your problem? – Marcus

+0

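Thanks. I tried running the code as edited: it keeps only the budget/revenue values that are not zero, but it messes up the sequence. Also, looking at the data, I think it removed a bit too much. I did think about csvread, but my file also contains non-numeric values. I do see the importance of a preview, though, as Andy pointed out, and I will edit my post. I would appreciate it if you could tell me (even briefly) what my other errors are. – Prn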